[khmer] zero abundance kmers
Aaron Liston
listona at science.oregonstate.edu
Thu Jan 30 10:18:34 PST 2014
Hi Titus: The raw data do have 12427 reads with Ns, so that makes sense. I
did create a new counting hash for each read set, and the subsequent files
do not have Ns. However, there are still some zero-abundance counts: after
trimming and adapter removal (854), after normalize-by-median (2), after
filter-abund (48). I can send you a link to the read files, if you would
like to take a look.
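
For anyone wanting to reproduce the "reads with Ns" check, here is a minimal sketch. It assumes plain 4-line FASTQ records (no wrapped sequence lines); the function name count_reads_with_n and the two example reads are hypothetical and not part of khmer.

```python
import io


def count_reads_with_n(fastq_handle):
    """Count FASTQ records whose sequence line contains at least one N.

    Assumes strict 4-line FASTQ records (header, sequence, '+', quality).
    """
    n_reads = 0
    for i, line in enumerate(fastq_handle):
        # The sequence is the second line of every 4-line record.
        if i % 4 == 1 and "N" in line.upper():
            n_reads += 1
    return n_reads


# Hypothetical two-read example: only the second read contains an N.
example = io.StringIO(
    "@r1\nACGTACGT\n+\nIIIIIIII\n"
    "@r2\nACGNACGT\n+\nIIIIIIII\n"
)
print(count_reads_with_n(example))
```

On a real file, the same function can be called on an open file handle instead of the in-memory example.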
Thanks, Aaron
-----Original Message-----
From: C. Titus Brown [mailto:ctb at msu.edu]
Sent: Thursday, January 30, 2014 6:19 AM
To: Aaron Liston
Cc: khmer at lists.idyll.org
Subject: Re: [khmer] zero abundance kmers
On Wed, Jan 29, 2014 at 09:15:24PM -0800, Aaron Liston wrote:
> I am using load-into-counting.py and abundance-dist.py to plot the
> kmer abundance distribution of read data at various stages in the
> digital normalization process. Some of the plots have a small
> fraction (<0.001) of kmers with zero abundance. Is this an error, or
> does it mean something?
Hi Aaron,
that should only happen if you're either missing some reads (e.g. if you
load data set A into a counting hash, and then count all the k-mers in data
set B) _or_ if you have Ns in your reads somewhere.
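
To illustrate the two cases above, here is a toy sketch using an exact in-memory counter. It is not khmer's implementation: the k-mer size, the helper names, and in particular the rule that k-mers containing N are skipped at load time are assumptions made to model the hypothesis above, which still needs to be verified.

```python
from collections import Counter

K = 4  # hypothetical k-mer size, chosen small for the example


def kmers(seq, k=K):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def load_counts(reads, k=K):
    """Count k-mers, skipping any that contain a non-ACGT character.

    ASSUMPTION: skipping N-containing k-mers models the behaviour
    hypothesised in this thread; it is not confirmed khmer behaviour.
    """
    counts = Counter()
    for read in reads:
        for km in kmers(read, k):
            if set(km) <= set("ACGT"):
                counts[km] += 1
    return counts


def abundance_dist(counts, reads, k=K):
    """Histogram {abundance: number of distinct k-mers} over reads."""
    dist = Counter()
    seen = set()
    for read in reads:
        for km in kmers(read, k):
            if km not in seen:
                seen.add(km)
                # A Counter returns 0 for k-mers that were never loaded.
                dist[counts[km]] += 1
    return dist


set_a = ["ACGTACGT"]
set_b = ["ACGTTTTT"]
with_n = ["ACGNACGT"]

# Case 1: load data set A, then count the k-mers of data set B.
print(dict(abundance_dist(load_counts(set_a), set_b)))
# Case 2: the same reads are loaded and counted, but one read has an N.
print(dict(abundance_dist(load_counts(with_n), with_n)))
# Control: same reads, no Ns, so no zero-abundance bin appears.
print(dict(abundance_dist(load_counts(set_a), set_a)))
```

In both of the first two cases the histogram gains a bin at abundance 0, while the control does not.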
Gang, we should verify this and put it in a FAQ... thanks for pointing it
out!
cheers,
--titus
--
C. Titus Brown, ctb at msu.edu