[khmer] Using khmer for producing k-mer frequency distribution
Rajat Shuvro Roy
rajatroy at cs.rutgers.edu
Tue Aug 27 14:27:34 PDT 2013
Thanks so much. I downloaded and compiled the latest version. make test
resulted in 'ok' for everything. However, when I tried to run it, I got the
following error:
python load-into-counting.py -k 31 -x 5e10 out.kh 1Mreads.fa
Traceback (most recent call last):
File "load-into-counting.py", line 13, in <module>
from khmer.counting_args import build_construct_args, report_on_config
ImportError: cannot import name report_on_config
On Tue, Aug 27, 2013 at 4:41 PM, C. Titus Brown <ctb at msu.edu> wrote:
> Hi Rajat,
> sorry for the long delay in response!
> On Thu, Jul 18, 2013 at 03:32:39PM -0400, Rajat Shuvro Roy wrote:
> > Hello Prof Brown,
> > I was attempting to produce a k-mer frequency distribution using khmer and
> > followed the instructions in (
> > http://khmer.readthedocs.org/en/latest/scripts.html) . I have a Zea mays
> > library (SRR404240, 95.8Gbp ) and I executed the following command.
> > python load-into-counting.py -k 31 -x 5e10 out.kh SRR404240.fasta
> > I believe, this counts k-mer frequencies and the script abundance-dist.py
> > produces the distribution.
> > We stopped it after it had run for 2464 mins (41hrs) using 187GB space. I
> > tried with smaller values for -x but failed to complete the computation
> > in less than 3 days. Could you please let us know if this is expected and
> > whether we should allow more time? And is there a more efficient way of
> > using khmer?
> Your e-mail actually triggered some doc changes and updates ;).
> Briefly, khmer can count k-mers in either constant-memory mode or in
> accurate-large-counts mode. In the former, counts above 255 will
> stop being counted, but the memory specified with the -N and -x parameters
> will be the total amount used; in the latter mode (which is the default),
> counts above 255 will be kept and memory use will expand indefinitely.
> You can use these modes easily in the latest khmer, the bleeding-edge
> branch; you can get that like so:
> git clone https://github.com/ged-lab/khmer.git -b bleeding-edge
> Then use 'load-into-counting.py -b' to build the tables, and
> to generate the output.
> I'd suggest running it on a small test data set (data/25k.fq.gz, in the
> khmer repo) just to make sure it all works for you, but it should - we use
> this regularly.
> Please let me know if you have any questions, and again, apologies for
> the delay!
> C. Titus Brown, ctb at msu.edu
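For readers following the thread, the two counting modes described in the quoted reply can be illustrated with a small pure-Python sketch. This is not khmer's implementation or API: the class name, the use of Python's built-in hash(), the table sizes, and the overflow dictionary are all assumptions made for illustration. It only shows the general idea of fixed-size tables of 8-bit counters that saturate at 255, with an optional side structure that keeps larger counts at the cost of unbounded memory.

```python
from collections import Counter

MAX_COUNT = 255  # largest value a single 8-bit counter can hold


class SaturatingKmerCounter:
    """Toy k-mer counter (hypothetical; not khmer's actual code)."""

    def __init__(self, k, table_sizes, bigcount=True):
        self.k = k
        # One bytearray per table: memory is fixed by the table sizes.
        self.tables = [bytearray(size) for size in table_sizes]
        self.bigcount = bigcount
        # Counts past 255 live here; this dict can grow without bound.
        self.overflow = {}

    def _slots(self, kmer):
        h = hash(kmer)
        return [(t, h % len(t)) for t in self.tables]

    def add(self, kmer):
        slots = self._slots(kmer)
        for table, i in slots:
            if table[i] < MAX_COUNT:
                table[i] += 1  # stop incrementing once saturated
        if self.bigcount and min(t[i] for t, i in slots) == MAX_COUNT:
            self.overflow[kmer] = self.overflow.get(kmer, MAX_COUNT - 1) + 1

    def get(self, kmer):
        if self.bigcount and kmer in self.overflow:
            return self.overflow[kmer]
        # Taking the minimum across tables limits the effect of collisions.
        return min(t[i] for t, i in self._slots(kmer))

    def consume(self, seq):
        for i in range(len(seq) - self.k + 1):
            self.add(seq[i:i + self.k])


def abundance_distribution(counter, sequences):
    """Histogram mapping a count to the number of distinct k-mers with it."""
    seen, hist = set(), Counter()
    for seq in sequences:
        for i in range(len(seq) - counter.k + 1):
            kmer = seq[i:i + counter.k]
            if kmer not in seen:
                seen.add(kmer)
                hist[counter.get(kmer)] += 1
    return hist


capped = SaturatingKmerCounter(3, [1009, 1013], bigcount=False)
exact = SaturatingKmerCounter(3, [1009, 1013], bigcount=True)
for _ in range(300):
    capped.add("AAA")
    exact.add("AAA")
print(capped.get("AAA"))  # 255: the counter saturated
print(exact.get("AAA"))   # 300: kept in the overflow dict
```

On memory: if -x is the size of each table in bytes and four tables are used (an assumption, the thread does not state the table count), then -x 5e10 would come to about 4 x 5e10 = 2e11 bytes, roughly 186 GiB, which is close to the 187GB reported in the original question.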