<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body text="#000000" bgcolor="#FFFFFF">
Hi,<br>
<br>
We want to analyse 454 metagenomic data (570,000 reads of ~700 nt
per sample).<br>
My questions are:<br>
1/ Given that khmer is rather short-read/Illumina oriented, are we
mistaken to try and apply it to our long 454 reads?<br>
2/ Is there an actual benefit in feeding .fastq files to khmer (our
quality scores are currently in separate .qual files, but that can be
changed), or does it only consider the sequence data? I.e., are
FASTA files sufficient?<br>
How do you decide which data need pre-normalization and which can
go straight to artifact removal? In the Iowa corn example, you do
not start with a normalize/filter pass; why is that?<br>
3/ Thus, should the pipeline for our data be:<br>
DIGINORM (normalize-by-median -&gt; filter-abund -&gt;
normalize-by-median) -&gt; ARTIFACT REMOVAL (load-graph -&gt;
partition-graph, etc.)<br>
or<br>
is the DIGINORM step unnecessary in our case?<br>
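<br>
On question 2, if FASTQ input does turn out to be useful, the separate
.fasta/.qual pairs can be merged beforehand. The following is only a
minimal sketch using the Python standard library, not part of khmer:
the function names <code>read_records</code> and
<code>merge_to_fastq</code> are hypothetical, and it assumes both files
list the reads in the same order, with space-separated Phred scores in
the .qual file and Sanger (offset 33) encoding on output.<br>

```python
import io


def read_records(handle):
    """Yield (read_name, list_of_body_lines) from a FASTA-style stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, chunks
            # Keep only the read identifier, drop any description.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, chunks


def merge_to_fastq(fasta_handle, qual_handle, out_handle, offset=33):
    """Write one FASTQ record per matching FASTA/QUAL record pair.

    Assumption: both inputs hold the same reads in the same order.
    """
    pairs = zip(read_records(fasta_handle), read_records(qual_handle))
    for (seq_name, seq_chunks), (qual_name, qual_chunks) in pairs:
        seq = "".join(seq_chunks)
        scores = [int(q) for q in " ".join(qual_chunks).split()]
        if seq_name != qual_name or len(seq) != len(scores):
            raise ValueError("FASTA/QUAL mismatch at read " + seq_name)
        quals = "".join(chr(q + offset) for q in scores)
        out_handle.write("@%s\n%s\n+\n%s\n" % (seq_name, seq, quals))


if __name__ == "__main__":
    # Tiny in-memory demonstration with made-up reads.
    fasta = io.StringIO(">r1 demo\nACGT\nAC\n>r2\nGG\n")
    qual = io.StringIO(">r1\n40 40 30 30\n20 20\n>r2\n10 0\n")
    out = io.StringIO()
    merge_to_fastq(fasta, qual, out)
    print(out.getvalue(), end="")
```

With real data the three handles would simply be opened files; the
mismatch check is there because 454 .qual files are easy to pair with
the wrong .fasta file.<br>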
<br>
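On question 3, and to make clear what is meant by the DIGINORM step:
the sketch below illustrates the general idea behind normalizing by
median k-mer abundance, namely keeping a read only while the median
count of its k-mers is still below a cutoff. It is a simplified
illustration and not khmer's implementation: it uses an exact in-memory
counter rather than a probabilistic counting structure, ignores reverse
complements, and the values of <code>k</code> and <code>cutoff</code>
are placeholders.<br>

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all overlapping k-mers of seq (empty list if len(seq) < k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize_by_median(reads, k=20, cutoff=20):
    """Keep a read only if its median k-mer count so far is below cutoff.

    Simplified illustration: exact counts, forward strand only.
    """
    counts = Counter()
    kept = []
    for read in reads:
        words = kmers(read, k)
        if not words:
            continue  # read is shorter than k, nothing to count
        # Counter returns 0 for unseen k-mers without inserting them.
        if cutoff > median(counts[w] for w in words):
            counts.update(words)  # only kept reads contribute counts
            kept.append(read)
    return kept


if __name__ == "__main__":
    # Five copies of one read plus one unrelated read (made-up data).
    reads = ["ACGTTGCA"] * 5 + ["GGGGCCCCAA"]
    for read in normalize_by_median(reads, k=4, cutoff=2):
        print(read)
```

Because the median is taken over all k-mers in a read, a 700 nt read
contributes far more k-mers to that median than a short Illumina read,
which is part of why question 1 is being asked.<br>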
Thanks for your help<br>
<br>
And thanks for your great work on khmer 1.0!<br>
<br>
Cheers from Bordeaux<br>
<br>
Alexis<br>
<div class="moz-signature">-- <br>
<p style="font-family: Arial, sans-serif; font-size: 13px; color:
#009DE0;line-height:17px"><span style="font-weight: bold"><a
href="mailto:alexis.groppi@u-bordeaux2.fr"
style="color:#009DE0; text-decoration:none;">Dr Alexis <span
style="text-transform:uppercase;">Groppi</span></a></span><br>
Deputy Director of CBiB - Project Officer at CGFB<br>
146, rue Léo Saignat - Case 68 - 33076 Bordeaux Cedex<br>
T. +33 5 57 57 12 18<br>
P. +33 6 35 95 04 87<br>
<a href="http://www.cbib.u-bordeaux2.fr" style="color:#009DE0;
text-decoration:none;">www.cbib.u-bordeaux2.fr</a><br>
</p>
</div>
</body>
</html>