[khmer] zero abundance kmers

Aaron Liston listona at science.oregonstate.edu
Thu Jan 30 10:18:34 PST 2014


Hi Titus: The raw data do have 12427 reads with Ns, so that makes sense. I
did create a new counting hash for each read set, and the subsequent files
do not have Ns. However, there are still some zero-abundance counts: after
trimming and adapter removal (854), after normalize-by-median (2), and after
filter-abund (48). I can send you a link to the read files if you would
like to take a look.
Thanks, Aaron

-----Original Message-----
From: C. Titus Brown [mailto:ctb at msu.edu] 
Sent: Thursday, January 30, 2014 6:19 AM
To: Aaron Liston
Cc: khmer at lists.idyll.org
Subject: Re: [khmer] zero abundance kmers

On Wed, Jan 29, 2014 at 09:15:24PM -0800, Aaron Liston wrote:
> I am using load-into-counting.py and abundance-dist.py  to plot the 
> kmer abundance distribution of read data at various stages in the 
> digital normalization process.  Some of the plots have a small 
> fraction (<0.001) of kmers with zero abundance.  Is this an error, or 
> does it mean something?

Hi Aaron,

that should only happen if you're either missing some reads (e.g. if you
load data set A into a counting hash, and then count all the k-mers in data
set B) _or_ if you have Ns in your reads somewhere.
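
To illustrate the two cases above, here is a minimal sketch that uses an exact
Python Counter in place of khmer's counting hash. It does not use the khmer API.
The function names, the tiny k, the toy reads, and the rule that reads
containing N are skipped at load time but still looked up afterwards are all
assumptions made for illustration, not a description of what
load-into-counting.py or abundance-dist.py actually do.

```python
from collections import Counter

K = 4  # tiny k, for illustration only


def kmers(seq, k=K):
    """Yield every k-mer in seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_kmers(reads, k=K):
    """Build an exact k-mer count table.

    Assumption: reads containing N are skipped entirely when loading.
    """
    table = Counter()
    for read in reads:
        if "N" in read:
            continue
        table.update(kmers(read, k))
    return table


def abundance_dist(reads, table, k=K):
    """Histogram of table abundances over the distinct k-mers in reads."""
    seen = set()
    hist = Counter()
    for read in reads:
        for kmer in kmers(read, k):
            if kmer not in seen:
                seen.add(kmer)
                hist[table[kmer]] += 1  # a missing k-mer looks up as 0
    return hist


set_a = ["ACGTAC", "CGTACG"]
set_b = ["ACGTAC", "TTTTGG"]   # second read shares no k-mers with set_a
with_n = ["ACGTAC", "ACNTAC"]  # second read contains an N

# Case 1: table built from set A, abundances looked up for set B.
mismatch = abundance_dist(set_b, count_kmers(set_a))

# Case 2: same read set for both steps, but one read has an N.
n_case = abundance_dist(with_n, count_kmers(with_n))

# Control: same read set, no Ns, so no zero-abundance bin.
control = abundance_dist(set_a, count_kmers(set_a))

print(mismatch[0], n_case[0], control[0])
```

In this toy model the zero-abundance bin is non-empty only when the table and
the reads disagree, either because they come from different read sets or
because some reads were skipped when the table was built.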

Gang, we should verify this and put it in a FAQ... thanks for pointing it
out!

cheers,
--titus
--
C. Titus Brown, ctb at msu.edu




