[khmer] khmer/khmer-protocols confusion

Francesco Rubino [frr11] frr11 at aber.ac.uk
Tue Mar 4 04:02:27 PST 2014


Hi,

I have a few more questions, specifically about abundance-dist.py. I used load-into-counting.py to make a hash table, but why doesn't abundance-dist.py support multiple files?

If I run abundance-dist.py on a per-file basis and concatenate the .hist files, will I get the same results, or should I concatenate all my sequence files into one to get the overall histogram for my dataset?
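
To make the question concrete, here is a toy exact k-mer counter in plain Python. It is only a sketch: it does not call khmer or abundance-dist.py, the helper names are hypothetical, and it assumes each per-file histogram is computed from that file's k-mers alone. Under that assumption, a k-mer shared between two files is counted once per file instead of once with a higher abundance, so the summed per-file histograms differ from the histogram of the combined data.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every k-mer of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def abundance_hist(reads, k):
    """Map abundance -> number of distinct k-mers with that abundance."""
    counts = Counter(km for read in reads for km in kmers(read, k))
    return Counter(counts.values())


# Toy stand-ins for two sequence files (assumed data, not real reads).
file_a = ["ACGTA"]
file_b = ["ACGTT"]
k = 4

# Histograms computed per file, then added together.
per_file_sum = abundance_hist(file_a, k) + abundance_hist(file_b, k)

# Histogram computed once over all reads.
combined = abundance_hist(file_a + file_b, k)

print(dict(per_file_sum))  # {1: 4}
print(dict(combined))      # {1: 2, 2: 1}
```

Here the shared k-mer ACGT has abundance 2 in the combined data but shows up as two abundance-1 entries when the files are counted separately. Whether abundance-dist.py behaves the same way depends on how the counting table was built, which this sketch does not model.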

Thanks for your help,
Francesco

On 28 Feb 2014, at 20:43, Michael R. Crusoe <mcrusoe at msu.edu> wrote:

On Mon, Feb 24, 2014 at 12:22 PM, Francesco Rubino [frr11] <frr11 at aber.ac.uk> wrote:

Hi Michael,


Thanks for your reply. I wasn't sure about the version because I found some documentation online about version 0.8.x and got confused. Regarding the Casava format, this is the page: http://khmer.readthedocs.org/en/latest/scripts.html

In the section about read handling it mentions the format @name/1 as the latest.

Thanks. I've updated this in the documentation for the newly released v0.8 of the khmer project.




Should I change my files to be interleaved and with the headers in the old format (the one ending in /1 or /2) to make sure I don't run into problems in some of the khmer scripts?

Correct. If you want your reads processed as pairs you need to use the /1,/2 format for now.



I have code to interleave the files and to detect and convert the header format anyway, so it won't really be a problem, but I'd prefer not to if possible.

Sorry, fixing this issue is on our roadmap.
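
For readers without such code, the conversion discussed above could be sketched roughly as follows. This is an illustration only, not part of khmer: the function names and the regular expression are assumptions, and it presumes well-formed four-line FASTQ records with Casava 1.8-style headers such as `@name 1:N:0:ATCACG`, with both files listing the pairs in the same order.

```python
import re

# Assumed Casava 1.8+ header layout: "@name <read>:<filtered>:<control>:<index>"
CASAVA18 = re.compile(r"^(@\S+)\s+([12]):[YN]:\d+:\S*$")


def to_slash_header(header):
    """Rewrite '@name 1:N:0:IDX' as '@name/1'; leave other headers unchanged."""
    match = CASAVA18.match(header)
    if match is None:
        return header
    return "{}/{}".format(match.group(1), match.group(2))


def read_fastq(lines):
    """Group an iterable of lines into (header, seq, plus, qual) records."""
    stripped = (line.rstrip("\n") for line in lines)
    # Passing the same iterator four times consumes four lines per record.
    return zip(stripped, stripped, stripped, stripped)


def interleave(r1_lines, r2_lines):
    """Yield the lines of an interleaved FASTQ with /1 and /2 headers."""
    pairs = zip(read_fastq(r1_lines), read_fastq(r2_lines))
    for (h1, s1, _, q1), (h2, s2, _, q2) in pairs:
        yield from (to_slash_header(h1), s1, "+", q1)
        yield from (to_slash_header(h2), s2, "+", q2)


r1 = ["@read1 1:N:0:ATCACG\n", "ACGT\n", "+\n", "IIII\n"]
r2 = ["@read1 2:N:0:ATCACG\n", "TTGG\n", "+\n", "JJJJ\n"]
for line in interleave(r1, r2):
    print(line)
```

In practice the two line lists would be replaced by open file handles. Note that `zip` stops silently at the shorter input, so a real script should also check that both files hold the same number of records.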


Thanks,
Francesco

________________________________
From: Michael R. Crusoe <mcrusoe at msu.edu>
Sent: 24 February 2014 17:07
To: Francesco Rubino [frr11]
Cc: khmer at lists.idyll.org
Subject: Re: [khmer] khmer/khmer-protocols confusion



On Mon, Feb 24, 2014 at 10:58 AM, Francesco Rubino [frr11] <frr11 at aber.ac.uk> wrote:
Hello all,

Hello Francesco,

Thank you for using the khmer suite and for your questions.

I've read part of the documentation of khmer and I'm interested in using it to either partition or normalise a meta-transcriptome I have. I have a bit of confusion about the versions of the software, though. I have a few questions I hope you can answer:

1) I've read both about a khmer 0.7.x (http://khmer.readthedocs.org/en/latest/) and a 0.8.x (https://github.com/ctb/khmer/releases). Which should I use?

The official GitHub repository for khmer is https://github.com/ged-lab/khmer. The current release is v0.7.1. Our primary method of distribution is the Python Package Index ( https://pypi.python.org/pypi/khmer ) using the `pip` command, as documented at http://khmer.readthedocs.org/en/latest/

GitHub will present any branch or tag as a release. Titus's personal repository has a branch named "protocols-v0.8.3", which refers to the version of khmer-protocols, not of the khmer project itself.

2) I see in the documentation that you need to interleave the fastq files if using paired-end data. Why do you refer to "@name/1" as the new Casava format? I thought that was the old one.

That would be our error. Supporting the new format is being tracked in this GitHub issue: https://github.com/ged-lab/khmer/issues/23

What document were you looking at?


Thanks,
Francesco Rubino





--
Michael R. Crusoe:  Programmer & Bioinformatician   mcrusoe at msu.edu
 @ the Genomics, Evolution, and Development lab; Michigan State U
http://ged.msu.edu/ http://orcid.org/0000-0002-2961-9670 @biocrusoe<http://twitter.com/biocrusoe>



