[khmer] Counting kmers and disabling reverse complement

Jordan Fish jrdn.fish at gmail.com
Fri Jun 14 05:53:22 PDT 2013


Hi Lester,

Unless you are working with fairly small k-values, you will probably want to
use the CountingHash.  Ktable handles simple exact counting, so for
large-ish values of k (>12, according to
http://khmer.readthedocs.org/en/latest/ktable.html) its memory use will
blow up.
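
To give a rough feel for why exact counting stops scaling, here is a small
Python sketch of the arithmetic.  It assumes one slot per possible DNA k-mer
and a 4-byte counter per slot; both are assumptions for illustration, not
khmer's actual memory layout, and the function name is made up.

```python
def exact_table_entries(k):
    """Number of distinct k-mers over the DNA alphabet {A, C, G, T}."""
    return 4 ** k


if __name__ == "__main__":
    BYTES_PER_COUNTER = 4  # assumed counter width, for illustration only
    for k in (8, 12, 16, 20):
        entries = exact_table_entries(k)
        gib = entries * BYTES_PER_COUNTER / 2 ** 30
        print("k=%d: %d entries, about %.3f GiB" % (k, entries, gib))
```

Each extra base multiplies the table size by four, which is why a table that
is comfortable at small k becomes impractical a few steps later.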

The counting hash uses a Bloom filter to limit memory usage at the cost of
inexact counting.  Hopefully Titus will jump in here with a link to some
documentation on the inexact counting.
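
Until then, here is a minimal Python sketch of the general idea behind this
kind of inexact counting.  It is a generic multi-table counter, not khmer's
implementation: the class name, the table sizes and the use of SHA-1 are all
assumptions made for the example.

```python
import hashlib


class ApproxCounter:
    """Fixed-memory k-mer counter; counts can be too high, never too low."""

    def __init__(self, table_sizes=(1009, 1013, 1019)):
        # One counter array per table; the sizes are arbitrary small primes.
        self.tables = [[0] * size for size in table_sizes]

    def _slots(self, kmer):
        # Hash once, then reduce modulo each table size.
        digest = hashlib.sha1(kmer.encode("ascii")).digest()
        value = int.from_bytes(digest[:8], "big")
        return [value % len(table) for table in self.tables]

    def add(self, kmer):
        for table, slot in zip(self.tables, self._slots(kmer)):
            table[slot] += 1

    def get(self, kmer):
        # Collisions only ever inflate a slot, so the minimum across tables
        # is the estimate closest to the true count.
        return min(table[slot]
                   for table, slot in zip(self.tables, self._slots(kmer)))


if __name__ == "__main__":
    counter = ApproxCounter()
    for kmer in ("ACGT", "ACGT", "TTGA"):
        counter.add(kmer)
    print(counter.get("ACGT"))
```

Memory is fixed by the table sizes rather than by k, and the price is that
two k-mers landing in the same slots can push each other's counts up.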

Finally, if you want to force khmer to treat a kmer and its reverse
complement as distinct kmers, you will need to edit 'lib/Makefile' and
change the line

NO_UNIQUE_RC=0

to

NO_UNIQUE_RC=1

and rebuild khmer.
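
To illustrate what the two behaviours mean for the counts, here is a small
pure-Python sketch.  It does not use khmer at all; the function names and the
choice of the lexicographically smaller string as the shared bin are
assumptions for the example, and khmer may pick its shared bin differently.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of an upper-case DNA string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def count_kmers(seq, k, merge_rc=True):
    """Count k-mers in seq, optionally merging each with its reverse complement."""
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if merge_rc:
            # Assumed convention: file both strands under the smaller string.
            kmer = min(kmer, reverse_complement(kmer))
        counts[kmer] += 1
    return counts


if __name__ == "__main__":
    print(count_kmers("AAATTT", 3))                  # strands merged
    print(count_kmers("AAATTT", 3, merge_rc=False))  # strands kept separate
```

With merging on, AAA and TTT share one bin; with it off, each gets its own
count, which is the behaviour Lester is asking for.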

Jordan

On Fri, Jun 14, 2013 at 3:22 AM, Lester Mackey <lmackey at stanford.edu> wrote:

> Dear khmer Discussion List,
>
> If my goal is to obtain a vector of kmer counts quickly from a FASTA or
> FASTQ file, is there any reason to prefer ktable to one of your other data
> structures, like the counting hash table?
>

> I've noticed that ktable hashes a kmer and its reverse complement to the
> same bin.  Is there an easy way to disable this feature (and thereby count
> each kmer and reverse complement separately)?
>
> Thanks,
> Lester
>
> _______________________________________________
> khmer mailing list
> khmer at lists.idyll.org
> http://lists.idyll.org/listinfo/khmer
>
>