[khmer] khmer on 454 metagenomics data
Alexis Groppi
alexis.groppi at u-bordeaux.fr
Fri Apr 4 07:49:42 PDT 2014
Hi,
We want to analyse 454 metagenomics data (570 000 reads of ~700 nt per
sample).
My questions are:
1/ Given that khmer is rather short-read/Illumina oriented, are we
mistaken to try and apply it to our long 454 reads?
2/ Is there an actual benefit in feeding .fastq files to khmer (in our
case the quality scores are in separate .qual files at the moment, but
that can be changed), or does it really only consider the sequence
data? I.e., are the fasta files sufficient?
How do you decide which data needs pre-normalization and which data can
go straight to artifact removal? In the Iowa corn example, you do not
start with a normalize/filter pass; why is that?
3/ Thus, should the pipeline for our data look like this:
DIGINORM (normalize-by-median -- filter-abund -- normalize-by-median)
-- ARTIFACT REMOVAL (load-graph -- partition-graph, ...etc)
or
is the DIGINORM step useless in our case?
Thanks for your help,
and thanks for your great work on khmer 1.0!
Cheers from Bordeaux
Alexis
--
CBiB - Université de Bordeaux <http://www.u-bordeaux.fr>
Dr Alexis Groppi <mailto:alexis.groppi at u-bordeaux2.fr>
Deputy Director of CBiB - Project Officer, CGFB
146, rue Léo Saignat - Case 68 - 33076 Bordeaux Cedex
T. +33 5 57 57 12 18
P. +33 6 35 95 04 87
www.cbib.u-bordeaux2.fr <http://www.cbib.u-bordeaux2.fr>