[khmer] questions about khmer utilities
C. Titus Brown
ctb at msu.edu
Mon Aug 19 18:24:57 PDT 2013
On Tue, Aug 20, 2013 at 12:50:29AM +0000, Kenlee Nakasugi wrote:
> I'm just starting to do some kmer-type analyses on my read data.
> I am trying to compare/find any differences in two different read datasets, and am trying to get some basic k-mer stats, but more importantly trying to figure out what would be the best type of comparison to do.
>
> I have already generated the hash (.kh) and hist (.hist) via the load-into-counting.py and abundance-dist.py scripts from khmer v0.4.
> Now from http://khmer.readthedocs.org/en/latest/blog-posts.html , i wanted to get the abundance by position, and hi-lo kmer distributions, but are the scripts listed there (and found in sandbox directory in khmer distribution) compatible with the output from 'load-into-counting.py and abundance-dist.py' ?
>
> I tried inputting the .hist and hash tables but getting errors like:
>
> ##
> python /usr/local/bin/khmer/sandbox/abundance-hist-by-position.py R1.k32.hist
> ... 0
> Traceback (most recent call last):
> File "/usr/local/bin/khmer/sandbox/abundance-hist-by-position.py", line 15, in <module>
> countSum[i] += int(tok[i])
> ValueError: invalid literal for int() with base 10: '0.617'
> ##
>
> Do I need to run these scripts from http://khmer.readthedocs.org/en/latest/blog-posts.html from scratch ?
> And any other types of comparisions useful?
Hi Ken,
yep, those scripts appear to be out of date with respect to the instructions;
thanks for pointing this out!
If you want to get abundance-1 and abundance-255 k-mers by position,
you can use
python sandbox/hi-lo-abundance-by-position.py 25k data/25k.fq.gz
and you'll see the output in .pos.abund=1 and .pos.abund=255 files.
To use the by-position scripts, here is a short guide with the sequences
in 25k.fq.gz:
python scripts/load-into-counting.py -k 20 25k data/25k.fq.gz
python sandbox/fasta-to-abundance-hist.py 25k data/25k.fq.gz
python sandbox/abundance-hist-by-position.py data/25k.fq.gz.freq > out.dist
We'll amend the documentation - thanks again for letting us know about this!
HTH,
--titus
--
C. Titus Brown, ctb at msu.edu
More information about the khmer
mailing list