[khmer] khmer/khmer-protocols confusion

Francesco Rubino
Mon Feb 24 09:22:25 PST 2014

Hi Michael,

Thanks for your reply. I wasn't sure about the version because I found some documentation online about version 0.8.x and got confused. Regarding the CASAVA format, this is the page: http://khmer.readthedocs.org/en/latest/scripts.html?

In the section about read handling it mentions the format @name/1 as the latest.

Should I change my files to be interleaved and with the headers in the old format (the one ending in /1 or /2) to make sure I don't run into problems in some of the khmer scripts?

I have code to interleave the files and to detect and convert the header format anyway, so it won't really be a problem, but I'd prefer not to if possible.
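
For reference, a minimal sketch of that kind of conversion in Python. It is not part of khmer, and the function names (to_old_header, read_fastq, interleave) are made up for illustration. It assumes the newer headers look like "@name 1:N:0:INDEX", that every record is exactly four lines, and that both files list the reads in the same order.

```python
import re
from itertools import islice, zip_longest

# Assumed layout of the newer header: "@<name> <read>:<filtered>:<control>:<index>"
NEW_HEADER = re.compile(r"^@(\S+)\s+([12]):[YN]:\d+:\S*$")


def to_old_header(header):
    """Rewrite '@name 1:N:0:INDEX' as '@name/1'; return other headers unchanged."""
    header = header.strip()
    match = NEW_HEADER.match(header)
    if match is None:
        return header
    name, read = match.groups()
    return "@{0}/{1}".format(name, read)


def read_fastq(lines):
    """Yield (header, sequence, plus, quality) tuples from 4-line FASTQ records."""
    stripped = (line.rstrip("\n") for line in lines)
    while True:
        record = tuple(islice(stripped, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def interleave(left_lines, right_lines):
    """Yield the lines of an interleaved FASTQ with old-style headers."""
    pairs = zip_longest(read_fastq(left_lines), read_fastq(right_lines))
    for left, right in pairs:
        if left is None or right is None:
            raise ValueError("inputs contain different numbers of reads")
        for header, sequence, _plus, quality in (left, right):
            yield to_old_header(header)
            yield sequence
            yield "+"
            yield quality
```

Both arguments can be open file handles, and the result can be written out one line at a time. Headers that already end in /1 or /2 pass through untouched, so the same code can be run on files in either format.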


Michael R. Crusoe
24 February 2014 17:07
To: Francesco Rubino [frr11]
Cc: khmer at lists.idyll.org
Subject: Re: [khmer] khmer/khmer-protocols confusion

On Mon, Feb 24, 2014 at 10:58 AM, Francesco Rubino wrote:
Hello all,

Hello Francesco,

Thank you for using the khmer suite and for your questions.

I've read part of the documentation of khmer and I'm interested in using it to either partition or normalise a meta-transcriptome I have. I have a bit of confusion about the versions of the software, though. I have a few questions I hope you can answer:

1) I've read both about a khmer 0.7.x (http://khmer.readthedocs.org/en/latest/) and a 0.8.x (https://github.com/ctb/khmer/releases). Which should I use?

The official GitHub repository for khmer is https://github.com/ged-lab/khmer. The current release is v0.7.1. Our primary method of distribution is the Python Package Index ( https://pypi.python.org/pypi/khmer ), using the `pip` command as documented at http://khmer.readthedocs.org/en/latest/

GitHub will present any branch or tag as a release. Titus's personal repository has a branch named "protocols-v0.8.3"; that name refers to the version of khmer-protocols, not of the khmer project itself.

2) I see in the documentation that you need to interleave the fastq files if using paired-end data. Why do you refer to "@name/1" as the new CASAVA format? I thought that was the old one.

That would be our error. Supporting the new format is being tracked in this GitHub issue: https://github.com/ged-lab/khmer/issues/23

What document were you looking at?

Francesco Rubino

khmer mailing list
khmer at lists.idyll.org

Michael R. Crusoe:  Programmer & Bioinformatician
 @ the Genomics, Evolution, and Development lab; Michigan State U
http://ged.msu.edu/ http://orcid.org/0000-0002-2961-9670 @biocrusoe
