[khmer] counting 31-mers with khmer

Qingpeng Zhang qingpeng at gmail.com
Thu Jun 20 13:21:23 PDT 2013


Also, if you want a list of all the k-mers and their counts,
here is a script for your reference.

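import khmer

K = 5  # k-mer size used in this example

def get_count(filename):
    # Arguments follow new_counting_hash(K, HT_SIZE, N_HT)
    ht = khmer.new_counting_hash(K, 1000000, 2)
    ht.consume_fasta(filename)

    # Walk every possible k-mer hash; only practical for small K
    for i in range(4**K):
        count = ht.get(i)
        if count:
            kmer = khmer.reverse_hash(i, K)
            print('%s %d' % (kmer, count))

filename = 'test.fa'
get_count(filename)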
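
For k=31 the loop over range(4**k) above is not practical, so one
option is to walk over the k-mers that actually occur in the reads and
look each one up. Here is a rough sketch of that idea. It assumes a
lookup callable that accepts a k-mer string, as in ht.get('AACT') in my
earlier message quoted below. iter_kmers and dump_counts are
hypothetical helper names, not part of khmer, and a plain dict stands
in for the counting hash so the sketch runs without khmer installed.

```python
def iter_kmers(seq, k):
    """Yield every k-length substring of seq, left to right."""
    for start in range(len(seq) - k + 1):
        yield seq[start:start + k]


def dump_counts(sequences, k, get_count):
    """Return sorted (kmer, count) pairs for the distinct k-mers seen.

    get_count is any callable mapping a k-mer string to its count
    (assumption: something like ht.get on a counting hash).
    """
    seen = set()
    for seq in sequences:
        seen.update(iter_kmers(seq.upper(), k))
    return [(kmer, get_count(kmer)) for kmer in sorted(seen)]


if __name__ == '__main__':
    # Stand-in for a counting hash: a plain dict filled from the reads.
    reads = ['ACGTACGT', 'CGTA']
    table = {}
    for read in reads:
        for kmer in iter_kmers(read, 4):
            table[kmer] = table.get(kmer, 0) + 1
    for kmer, count in dump_counts(reads, 4, table.get):
        print('%s %d' % (kmer, count))
```

Note that the sketch keeps every distinct k-mer in a Python set, so it
is only meant for modest inputs, and it does not treat a k-mer and its
reverse complement as the same string.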

Regards,
Qingpeng
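
P.S. On why ktable stops working around k=16: Titus's note quoted below
says ktable allocates 4**k bytes. A quick back-of-the-envelope check of
what that means (ktable_bytes is only an illustrative helper, not a
khmer function): 4**15 bytes is 1 GiB, 4**16 is 4 GiB, and 4**31 is
2**62 bytes, far more than any machine has.

```python
def ktable_bytes(k):
    """Bytes needed if one byte is reserved for each of the 4**k k-mers."""
    return 4 ** k


for k in (12, 15, 16, 31):
    gib = ktable_bytes(k) / float(2 ** 30)
    print('k=%d: %d bytes (%.3g GiB)' % (k, ktable_bytes(k), gib))
```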


On Thu, Jun 20, 2013 at 4:10 PM, Qingpeng Zhang <qingpeng at gmail.com> wrote:
> Hi Rajat,
> As in the script Titus wrote [1], you can still use the get() method to
> retrieve the count of a k-mer (as with ktable):
>
> ht = khmer.new_counting_hash(K, HT_SIZE, N_HT)
> ht.consume_fasta(filename)
> count = ht.get('AACT')
>
>
> [1]  http://lists.idyll.org/pipermail/khmer/2013-June/000104.html
>
> Regards,
> Qingpeng
>
>
>
> On Thu, Jun 20, 2013 at 2:25 PM, Rajat <rajatshuvro at gmail.com> wrote:
>> Thanks. I got it running with load-into-counting.py. However, I want to
>> print out the k-mers with frequencies in a file. I don't see any example for
>> doing that in http://khmer.readthedocs.org/en/latest/scripts.html. Can
>> anyone help?
>>
>> Thanks in advance.
>> Rajat
>>
>>
>> On Thu, Jun 20, 2013 at 9:27 AM, C. Titus Brown <ctb at msu.edu> wrote:
>>>
>>> On Thu, Jun 20, 2013 at 08:31:52AM -0400, Rajat wrote:
>>> > Hi Prof,
>>> > I tried to follow the example at
>>> > http://khmer.readthedocs.org/en/latest/ktable.html to extract kmers
>>> > using
>>> > khmer. It works fine for small values of k (<=15) but exits without any
>>> > output as soon as I make k>=16. In
>>> > http://khmer.readthedocs.org/en/latest/ktable.html they mention that it
>>> > does not work for k>12 on their machines. However, it is mentioned in
>>> > http://khmer.readthedocs.org/en/latest/introduction.html that khmer can
>>> > count kmers for k<=32. Could you point me to some examples that I can
>>> > follow to count 31-mers with khmer?
>>>
>>> Hi Rajat,
>>>
>>> please see this recent thread on the khmer mailing list:
>>>
>>> http://lists.idyll.org/pipermail/khmer/2013-June/thread.html#start
>>>
>>> Briefly, ktable does the "dumb thing" and allocates 4**k bytes of memory,
>>> which only works up to ~k=15 on many computers.  The counting hash data
>>> structure will count for much larger k.
>>>
>>> You can also take a look at this IPython Notebook:
>>>
>>>
>>> http://nbviewer.ipython.org/urls/raw.github.com/ngs-docs/ngs-notebooks/master/ngs-5x-kmer-abundance-distributions-2013.ipynb
>>>
>>> for information on getting k-mer abundance distributions.
>>>
>>> best,
>>> --titus
>>> --
>>> C. Titus Brown, ctb at msu.edu
>>
>>
>>



