[protocols] Khmer produces many partitions

Connor Skennerton c.skennerton at gmail.com
Wed Mar 26 21:31:03 PDT 2014


Hey Khmer team,

I’ve been working through the khmer protocols with some metagenomic data of my own, and I’ve just completed the step where you run ‘do-partition.py’ to generate the *.part files. Judging by the terminal output, though, the script wants to create literally millions of partitions! Have you ever seen this kind of thing before?

I should point out that my data comes from deep-sea sediment, so it’s more complex than what this protocol was designed for. I found the following page in the khmer documentation (http://khmer.readthedocs.org/en/latest/partitioning-big-data.html) that talks about big datasets. Is this still the preferred workflow for big datasets, or do you have some new tricks up your sleeve?

Thanks,
Connor