[khmer] Partitioning question
C. Titus Brown
ctb at msu.edu
Thu Aug 1 05:28:50 PDT 2013
[ redirecting discussion to the khmer at lists.idyll.org list -- see
http://lists.idyll.org/listinfo/khmer ]
On Wed, Jul 31, 2013 at 04:00:18PM -0700, Bill Nelson wrote:
> I have a relatively large dataset (~300M Illumina reads) from a relatively
> simple community (~18 organisms). I am trying to partition the data to see
> if my assemblies improve.
>
> When I run partition-graph.py, the resulting pmap files always have
> approximately the same size. Is that expected behavior when I know the
> organisms are present at very different abundances?
Hi Bill,
Short version: yes. The pmaps are created to divide all of the k-mers
roughly evenly (well, it's a bit more complicated, but that's the basic idea),
so the number and size of the pmap files will correlate with overall diversity
rather than with how abundant each organism is.
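A toy sketch of that idea, in case it helps. This is not khmer's actual
partitioning code: the reads, the k-mer size, and the helper names (kmers,
split_evenly) are all made up for illustration, and the real process is more
involved. It only shows that when the distinct k-mers are divided evenly, the
chunk sizes do not depend on how often each k-mer was sequenced.

```python
from collections import Counter

def kmers(seq, k):
    """Return every k-mer in seq, in order (hypothetical helper)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def split_evenly(items, n_chunks):
    """Split items into chunks of (nearly) equal size (hypothetical helper)."""
    items = sorted(items)
    size = -(-len(items) // n_chunks)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]

# Two made-up "organisms": the first is sampled 100x more often than the second.
reads = ["ACGTACGGTCA"] * 100 + ["TTGACCATGCA"] * 1
K = 4  # toy k-mer size, chosen only to keep the example small

counts = Counter(km for read in reads for km in kmers(read, K))
distinct = list(counts)              # each k-mer once, regardless of abundance
chunks = split_evenly(distinct, 4)
chunk_sizes = [len(c) for c in chunks]

print("total k-mer occurrences:", sum(counts.values()))
print("distinct k-mers:", len(distinct))
print("chunk sizes:", chunk_sizes)
```

Both toy organisms contribute the same number of distinct k-mers even though
their read counts differ by a factor of 100, so the evenly divided chunks come
out the same size.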
Also, for so few reads, I would expect digital normalization to give you good
results; partitioning is probably overkill (although there are other reasons
why it's not a bad idea).
cheers,
--titus
--
C. Titus Brown, ctb at msu.edu