<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<font face="Calibri">Hi Raf, hi Titus,<br>
<br>
Yesterday I got this error too, with another public dataset, after
running a different trimming tool (trim_galore).<br>
After trimming, you probably have some reads that are shorter than
the k-mer size used for normalization.<br>
'normalize-by-median.py' seems to ignore those reads, and because
their mate is then not "found", it raises an unpaired-reads error.<br>
In my case, removing the reads shorter than the k-mer size solved
the problem.<br>
</font>So it's a misleading error message rather than a bug.<br>
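<br>
In case it helps, here is a minimal sketch of the kind of filtering I mean. It is not part of khmer, and the function names (read_fastq, drop_short_pairs, filter_file) are only illustrative. It assumes a plain 4-line FASTQ file that is already interleaved (mate 1 directly followed by mate 2) and drops a whole pair as soon as one mate is shorter than k, so the pairing stays intact. The default k=20 is taken from the "-k" value in Raf's output.<br>
<pre>
```python
from itertools import islice


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples from 4-line FASTQ records."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield tuple(record)


def drop_short_pairs(records, k):
    """Yield both mates of a pair only when both are at least k bases long.

    Assumption: records are interleaved, mate 1 directly followed by mate 2.
    """
    it = iter(records)
    for first in it:
        second = next(it, None)
        if second is None:
            raise ValueError("odd number of records; input is not interleaved")
        # Keep the pair only if the shorter mate still holds one full k-mer.
        if min(len(first[1]), len(second[1])) >= k:
            yield first
            yield second


def filter_file(in_path, out_path, k=20):
    """Write the surviving pairs to out_path and return how many pairs were kept."""
    kept_reads = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for record in drop_short_pairs(read_fastq(src), k):
            dst.write("\n".join(record) + "\n")
            kept_reads += 1
    return kept_reads // 2
```
</pre>
For Raf's file this would be called as filter_file("SRR492065_trim_combined.fastq.pe", "filtered.fastq.pe", k=20), where the output name is just a placeholder. Most trimming tools can also apply a minimum length themselves, which avoids the extra pass.<br>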
<br>
Regards, <br>
<br>
Cedric<br>
<br>
<blockquote type="cite">
<pre>Hi Raf,
this sounds like a bug of some sort, but no clear idea of what's
going on, sorry! I should be able to take a look at this file this
weekend.
thx,
--titus
On Fri, Oct 30, 2015 at 03:37:25PM +0100, Raf Winand wrote:
><i> Hi
</i>><i>
</i>><i> I'm trying out some of the examples I found on the internet and am now
</i>><i> working on part of the data that comes with walk-through called Kalamazoo
</i>><i> Metagenome Assembly protocol. The data set I'm currently trying out is
</i>><i> SRR492065. After running trimmomatic I end up with two PE files that I
</i>><i> interleave using the script 'interleave-reads.py'. When I run
</i>><i> 'normalize-by-median.py' with the --paired option on this interleaved file,
</i>><i> it gives an error (output below). I also used the script
</i>><i> 'extract-paired-reads.py' on the interleaved file and when it finishes it
</i>><i> says "DONE; read 10264272 sequences, 5132136 pairs and 0 singletons" so the
</i>><i> original file was probably fine. Running the normalization again on the
</i>><i> output of 'extract-paired-reads.py' gives the exact same error as before.
</i>><i>
</i>><i> Do you have any idea what might be causing this?
</i>><i>
</i>><i> Best regards
</i>><i> Raf
</i>><i>
</i>><i>
</i>><i> || This is the script normalize-by-median.py in khmer.
</i>><i> || You are running khmer version 2.0+36.g799039f
</i>><i> || You are also using screed version 0.9
</i>><i> ||
</i>><i> || If you use this script in a publication, please cite EACH of the
</i>><i> following:
</i>><i> ||
</i>><i> || * MR Crusoe et al., 2015.
</i>><i> <a href="http://dx.doi.org/10.12688/f1000research.6924.1">http://dx.doi.org/10.12688/f1000research.6924.1</a>
</i>><i> || * CT Brown et al., arXiv:1203.4802 [q-bio.GN]
</i>><i> ||
</i>><i> || Please see <a href="http://khmer.readthedocs.org/en/latest/citations.html">http://khmer.readthedocs.org/en/latest/citations.html</a> for
</i>><i> details.
</i>><i>
</i>><i>
</i>><i> PARAMETERS:
</i>><i> - kmer size = 20 (-k)
</i>><i> - n tables = 4 (-N)
</i>><i> - max tablesize = 8e+09 (-x)
</i>><i>
</i>><i> Estimated memory usage is 3.2e+10 bytes (n_tables x max_tablesize)
</i>><i> --------
</i>><i> making countgraph
</i>><i> ... kept 100000 of 100000 or 100.0% sofar
</i>><i> ... in file SRR492065_trim_combined.fastq.pe
</i>><i> ... kept 199984 of 200000 or 100.0% sofar
</i>><i> ... in file SRR492065_trim_combined.fastq.pe
</i>><i> ... kept 299832 of 300000 or 99.9% sofar
</i>><i> ... in file SRR492065_trim_combined.fastq.pe
</i>><i> ... kept 399356 of 400000 or 99.8% sofar
</i>><i> ... in file SRR492065_trim_combined.fastq.pe
</i>><i> ** ERROR: Unpaired reads when require_paired is set!
</i>><i> ** Failed on SRR492065_trim_combined.fastq.pe:
</i>><i> ** Exiting!
</i>><i>
</i>><i>
</i>><i>
</i>><i>
</i>><i> --
</i>><i> Raf Winand
</i>><i> PhD student
</i>><i> Faculty of Engineering - ESAT/STADIUS
</i>><i> Bioinformatics Group
</i>><i> Kasteelpark Arenberg 10 bus 2446
</i>><i> 3001 Heverlee
</i>><i> BELGIUM
</i>><i> Tel: +32 16 32 86 43
</i>
--
C. Titus Brown, <a href="http://lists.idyll.org/listinfo/khmer">ctbrown at ucdavis.edu</a>
</pre>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
-----------------------------------------------------------------
Cédric Cabau
INRA | SIGENAE | GenPhySE
CS 52627 - 31326 Castanet-Tolosan cedex FRANCE
Tel: +33(0)5.61.28.54.60 - Fax: +33(0)5.61.28.53.08
<a class="moz-txt-link-freetext" href="http://www.sigenae.org/">http://www.sigenae.org/</a>
-----------------------------------------------------------------
</pre>
</body>
</html>