[khmer] Counting kmers and disabling reverse complement

Lester Mackey lmackey at stanford.edu
Sun Jun 16 02:49:48 PDT 2013


P.S. By 'python directory' in my last message I really meant the main khmer
directory and the Makefile therein.  Apologies for any confusion.


On Sun, Jun 16, 2013 at 1:09 AM, Lester Mackey <lmackey at stanford.edu> wrote:

> Hi Jordan,
>
> If, after running 'make clean' in all directories, I set NO_UNIQUE_RC=1 in
> 'lib/Makefile' and then run 'make' in the higher-level python directory,
> the Makefile fails to add -DNO_UNIQUE_RC=1 to CXXFLAGS when compiling the
> code in 'lib' (and so the reverse complement and kmer are still treated
> identically).
>
> I've gotten around this by adding -DNO_UNIQUE_RC=1 to CXXFLAGS explicitly
> in python/Makefile, but I wanted to let you know in case this was not the
> desired behavior.
>
> Lester
>
>
> On Fri, Jun 14, 2013 at 5:53 AM, Jordan Fish <jrdn.fish at gmail.com> wrote:
>
>> Hi Lester,
>>
>> Unless you are working with fairly small k-values you will probably want
>> to use the CountingHash.  Ktable handles simple exact counting, so for
>> large-ish values of k (>12, according to
>> http://khmer.readthedocs.org/en/latest/ktable.html) it'll blow up.
>>
>> The counting hash uses a Bloom filter to limit memory usage at the cost
>> of inexact counting.  Hopefully Titus will jump in here with a link to
>> some documentation on the inexact counting.
>>
>> Finally, if you want to force khmer to treat a kmer and its reverse
>> complement as distinct you will need to edit 'lib/Makefile' and change the
>> line
>>
>> NO_UNIQUE_RC=0
>>
>> to
>>
>> NO_UNIQUE_RC=1
>>
>> and rebuild khmer.
>>
>> Jordan
>>
>> On Fri, Jun 14, 2013 at 3:22 AM, Lester Mackey <lmackey at stanford.edu> wrote:
>>
>>> Dear khmer Discussion List,
>>>
>>> If my goal is to obtain a vector of kmer counts quickly from a FASTA or
>>> FASTQ file, is there any reason to prefer ktable to one of your other data
>>> structures, like the counting hash table?
>>>
>>> I've noticed that ktable hashes a kmer and its reverse complement to the
>>> same bin.  Is there an easy way to disable this feature (and thereby count
>>> each kmer and reverse complement separately)?
>>>
>>> Thanks,
>>> Lester
>>>
>>
>
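For readers unfamiliar with the behavior discussed in this thread, here is a
minimal Python sketch of the difference between counting a kmer together with
its reverse complement and counting the two separately.  It does not use
khmer and is not khmer's implementation: the function names and the
collapse_rc parameter are hypothetical, and picking the lexicographically
smaller string as the shared key is an assumption made for illustration
(khmer may choose its shared bin differently).

```python
from collections import Counter

# Complement table for DNA bases (assumes upper-case A/C/G/T input only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer):
    """Return the reverse complement of a DNA string."""
    return kmer.translate(_COMPLEMENT)[::-1]


def count_kmers(seq, k, collapse_rc=True):
    """Count the k-mers in seq (hypothetical helper, not part of khmer).

    collapse_rc=True:  a k-mer and its reverse complement share one counter,
                       keyed by the lexicographically smaller of the two.
    collapse_rc=False: every k-mer is counted exactly as it appears.
    """
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if collapse_rc:
            kmer = min(kmer, reverse_complement(kmer))
        counts[kmer] += 1
    return counts


if __name__ == "__main__":
    seq = "AAATTT"
    print(count_kmers(seq, 3))                     # pairs merged
    print(count_kmers(seq, 3, collapse_rc=False))  # pairs kept apart
```

With collapse_rc=True, the sequence AAATTT yields two counters (AAA and AAT,
each with count 2), because TTT and ATT are folded into their reverse
complements.  With collapse_rc=False, all four 3-mers get their own counter,
which is the kind of result a build with NO_UNIQUE_RC=1 is meant to give.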