[khmer] Counting kmers and disabling reverse complement
Jordan Fish
jrdn.fish at gmail.com
Fri Jun 14 05:53:22 PDT 2013
Hi Lester,
Unless you are working with fairly small k-values, you will probably want to
use the CountingHash. Ktable does simple exact counting, so for
large-ish values of k (>12, according to
http://khmer.readthedocs.org/en/latest/ktable.html) its memory use will blow up.
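
To see why exact counting stops scaling, here is a rough sketch in plain Python. It is not khmer's code: it assumes a dense table with one counter per possible k-mer, and the names dense_table_entries, kmer_index and count_exact are made up for illustration.

```python
# Illustrative only, not khmer's implementation.
# Assumption: an exact table keeps one counter for every possible k-mer.
ENCODING = {"A": 0, "C": 1, "G": 2, "T": 3}


def dense_table_entries(k):
    """Number of counters a dense table needs for k-mers over A/C/G/T."""
    return 4 ** k


def kmer_index(kmer):
    """Map a k-mer to its slot by reading it as a base-4 number."""
    index = 0
    for base in kmer:
        index = index * 4 + ENCODING[base]
    return index


def count_exact(sequence, k):
    """Count every k-mer in the sequence exactly, using a dense list."""
    counts = [0] * dense_table_entries(k)
    for i in range(len(sequence) - k + 1):
        counts[kmer_index(sequence[i:i + k])] += 1
    return counts
```

Each extra base multiplies the table size by 4: k=12 already needs 4**12 = 16,777,216 counters, and k=16 needs about 4.3 billion.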
The counting hash uses a Bloom filter to limit memory usage at the cost of
inexact counting. Hopefully Titus will jump in here with a link to some
documentation on the inexact counting.
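
Until that link shows up, here is a small sketch of the general idea behind inexact counting in fixed memory. It is a generic "several hashed tables, report the minimum" counter written in plain Python, and it should not be read as CountingHash's actual layout or API. The class name ApproxKmerCounter, the table sizes and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib


class ApproxKmerCounter:
    """Hypothetical fixed-memory counter; NOT khmer's CountingHash."""

    def __init__(self, table_sizes=(1009, 1013, 1019)):
        # Memory is fixed by the table sizes, whatever the value of k.
        self.tables = [[0] * size for size in table_sizes]

    def _slots(self, kmer):
        # One slot per table, chosen by a differently seeded hash.
        for seed, table in enumerate(self.tables):
            digest = hashlib.sha256(f"{seed}:{kmer}".encode()).digest()
            yield table, int.from_bytes(digest[:8], "big") % len(table)

    def add(self, kmer):
        for table, slot in self._slots(kmer):
            table[slot] += 1

    def get(self, kmer):
        # Collisions can only inflate a slot, so the minimum across
        # tables is the best estimate and never undercounts.
        return min(table[slot] for table, slot in self._slots(kmer))
```

The trade-off is that different k-mers may share slots, so a reported count can be too high but never too low. Larger or more numerous tables make collisions rarer.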
Finally, if you want to force khmer to treat a kmer and its reverse
complement as distinct, you will need to edit 'lib/Makefile' and change the
line
NO_UNIQUE_RC=0
to
NO_UNIQUE_RC=1
and rebuild khmer.
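
To make the difference concrete, here is a plain-Python sketch of counting with and without merging reverse complements. It does not call khmer at all. The helper names and the collapse_rc parameter are invented for the example, and picking the lexicographically smaller of the two strands as the shared key is just one common convention, not necessarily what khmer does internally.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(kmer):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(kmer))


def count_kmers(sequence, k, collapse_rc=True):
    """Count k-mers, optionally merging each k-mer with its reverse complement."""
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if collapse_rc:
            # Assumed convention: both strands share the smaller string as key.
            kmer = min(kmer, reverse_complement(kmer))
        counts[kmer] += 1
    return counts
```

For the sequence "AAAATTTT" with k=4, the merged count puts AAAA and TTTT in one bin (count 2), while collapse_rc=False keeps them as two separate k-mers with count 1 each.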
Jordan
On Fri, Jun 14, 2013 at 3:22 AM, Lester Mackey <lmackey at stanford.edu> wrote:
> Dear khmer Discussion List,
>
> If my goal is to obtain a vector of kmer counts quickly from a FASTA or
> FASTQ file, is there any reason to prefer ktable to one of your other data
> structures, like the counting hash table?
>
> I've noticed that ktable hashes a kmer and its reverse complement to the
> same bin. Is there an easy way to disable this feature (and thereby count
> each kmer and reverse complement separately)?
>
> Thanks,
> Lester