[khmer] questions about khmer utilities

Mon Aug 19 17:50:29 PDT 2013

Hi,

I'm just starting to do some k-mer analyses on my read data.
I want to find any differences between two read datasets. I am trying to get some basic k-mer stats but, more importantly, to figure out which type of comparison would be best.

I have already generated the hash (.kh) and histogram (.hist) files with the load-into-counting.py and abundance-dist.py scripts from khmer v0.4.
Following http://khmer.readthedocs.org/en/latest/blog-posts.html , I wanted to get the abundance by position and the hi-lo k-mer distributions. Are the scripts listed there (and found in the sandbox directory of the khmer distribution) compatible with the output of load-into-counting.py and abundance-dist.py?

I tried passing in the .hist files and the hash tables, but I get errors like:

##
python /usr/local/bin/khmer/sandbox/abundance-hist-by-position.py R1.k32.hist 
... 0
Traceback (most recent call last):
  File "/usr/local/bin/khmer/sandbox/abundance-hist-by-position.py", line 15, in <module>
    countSum[i] += int(tok[i])
ValueError: invalid literal for int() with base 10: '0.617'
##
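
My guess is that the '0.617' is the last column of the .hist file, so the sandbox script may expect a different input format. As a fallback, here is a minimal sketch of how I could read the .hist files directly and compare two of them. It assumes each line has four whitespace-separated columns (abundance, count, cumulative count, cumulative fraction), which I have not confirmed against the khmer docs. The function names parse_hist and fraction_diff are my own and not part of khmer.

```python
def parse_hist(lines):
    """Return {abundance: count} from lines of a .hist file.

    Assumed layout (not confirmed against khmer docs):
    abundance count cumulative_count cumulative_fraction
    Only the first two columns are read, so the float in the
    last column is never passed to int().
    """
    hist = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        tok = line.split()
        if len(tok) < 2:
            continue  # skip malformed lines
        hist[int(tok[0])] = int(tok[1])
    return hist


def fraction_diff(hist_a, hist_b):
    """Per-abundance difference in the fraction of distinct k-mers (a minus b)."""
    total_a = sum(hist_a.values()) or 1
    total_b = sum(hist_b.values()) or 1
    diffs = {}
    for abund in sorted(set(hist_a) | set(hist_b)):
        frac_a = hist_a.get(abund, 0) / total_a
        frac_b = hist_b.get(abund, 0) / total_b
        diffs[abund] = frac_a - frac_b
    return diffs


# Usage sketch (R2.k32.hist is a hypothetical second file):
# with open("R1.k32.hist") as fa, open("R2.k32.hist") as fb:
#     diffs = fraction_diff(parse_hist(fa), parse_hist(fb))
```

Normalising by the total count should make the two datasets comparable even if they differ in size, though I am not sure this is the most informative comparison.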

Do I need to run the scripts from http://khmer.readthedocs.org/en/latest/blog-posts.html from scratch?
And are there any other types of comparisons that would be useful?

Any advice appreciated!

Cheers,
Ken