[khmer] Paired-end error

Cédric Cabau Cedric.Cabau at toulouse.inra.fr
Thu Nov 5 06:01:44 PST 2015


Hi Raf, hi Titus,

Yesterday I got this error too, on another public dataset, after 
running a different trimming tool (trim_galore).
After trimming, you probably end up with some reads that are shorter 
than the k-mer size used for normalization.
'normalize-by-median.py' seems to ignore those reads, and because the 
mate is then not "found", it raises an unpaired-reads error.
In my case, removing reads shorter than the k-mer size solved the problem.
So it is a misleading error message rather than a bug.
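
For illustration, here is a minimal sketch of that filtering step. It is not part of khmer, and the function names (read_fastq, drop_short_pairs) are my own hypothetical helpers. It assumes a strictly interleaved FASTQ file with exactly four lines per record, and it drops a pair as soon as either mate is shorter than k, so the output stays properly paired.

```python
from itertools import islice


def read_fastq(handle):
    """Yield (header, seq, plus, qual) tuples from a 4-line-per-record FASTQ stream."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        yield tuple(record)


def drop_short_pairs(in_handle, out_handle, k):
    """Write only pairs in which both mates have length >= k.

    Assumes records alternate mate 1, mate 2 (interleaved input).
    Returns (kept, dropped) counted in pairs.
    """
    kept = dropped = 0
    records = read_fastq(in_handle)
    for r1 in records:
        r2 = next(records, None)
        if r2 is None:
            raise ValueError("odd number of records; input is not strictly interleaved")
        if len(r1[1]) >= k and len(r2[1]) >= k:
            for rec in (r1, r2):
                out_handle.write("\n".join(rec) + "\n")
            kept += 1
        else:
            dropped += 1
    return kept, dropped


# Example call (file names are placeholders, k must match the -k value used):
# with open("interleaved.fastq.pe") as src, open("filtered.fastq.pe", "w") as dst:
#     print(drop_short_pairs(src, dst, k=20))
```

Dropping the whole pair is the simplest choice here. If you would rather keep the surviving long mate, it would have to go to a separate singleton file, since leaving it in the interleaved file would again produce unpaired reads.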

Regards,

Cedric

> Hi Raf,
>
> this sounds like a bug of some sort, but no clear idea of what's
> going on, sorry!  I should be able to take a look at this file this
> weekend.
>
> thx,
> --titus
>
> On Fri, Oct 30, 2015 at 03:37:25PM +0100, Raf Winand wrote:
> > Hi
> >
> > I'm trying out some of the examples I found on the internet and am now
> > working on part of the data that comes with walk-through called Kalamazoo
> > Metagenome Assembly protocol. The data set I'm currently trying out is
> > SRR492065. After running trimmomatic I end up with two PE files that I
> > interleave using the script 'interleave-reads.py'. When I run
> > 'normalize-by-median.py' with the --paired option on this interleaved file,
> > it gives an error (output below). I also used the script
> > 'extract-paired-reads.py' on the interleaved file and when it finishes it
> > says "DONE; read 10264272 sequences, 5132136 pairs and 0 singletons" so the
> > original file was probably fine. Running the normalization again on the
> > output of 'extract-paired-reads.py' gives the exact same error as before.
> >
> > Do you have any idea what might be causing this?
> >
> > Best regards
> > Raf
> >
> >
> > || This is the script normalize-by-median.py in khmer.
> > || You are running khmer version 2.0+36.g799039f
> > || You are also using screed version 0.9
> > ||
> > || If you use this script in a publication, please cite EACH of the following:
> > ||
> > ||   * MR Crusoe et al., 2015. http://dx.doi.org/10.12688/f1000research.6924.1
> > ||   * CT Brown et al., arXiv:1203.4802 [q-bio.GN]
> > ||
> > || Please see http://khmer.readthedocs.org/en/latest/citations.html for details.
> >
> >
> > PARAMETERS:
> >  - kmer size = 20 (-k)
> >  - n tables = 4 (-N)
> >  - max tablesize = 8e+09 (-x)
> >
> > Estimated memory usage is 3.2e+10 bytes (n_tables x max_tablesize)
> > --------
> > making countgraph
> > ... kept 100000 of 100000 or 100.0% sofar
> > ... in file SRR492065_trim_combined.fastq.pe
> > ... kept 199984 of 200000 or 100.0% sofar
> > ... in file SRR492065_trim_combined.fastq.pe
> > ... kept 299832 of 300000 or 99.9% sofar
> > ... in file SRR492065_trim_combined.fastq.pe
> > ... kept 399356 of 400000 or 99.8% sofar
> > ... in file SRR492065_trim_combined.fastq.pe
> > ** ERROR: Unpaired reads when require_paired is set!
> > ** Failed on SRR492065_trim_combined.fastq.pe:
> > ** Exiting!
> >
> >
> > --
> > Raf Winand
> > PhD student
> > Faculty of Engineering - ESAT/STADIUS
> > Bioinformatics Group
> > Kasteelpark Arenberg 10 bus 2446
> > 3001 Heverlee
> > BELGIUM
> > Tel: +32 16 32 86 43
> > _______________________________________________
> > khmer mailing list
> > khmer at lists.idyll.org
> > http://lists.idyll.org/listinfo/khmer
>
> -- 
> C. Titus Brown, ctbrown at ucdavis.edu

-- 

-----------------------------------------------------------------
Cédric Cabau
INRA | SIGENAE | GenPhySE
CS 52627 - 31326 Castanet-Tolosan cedex FRANCE
Tel: +33(0)5.61.28.54.60 - Fax: +33(0)5.61.28.53.08
http://www.sigenae.org/
-----------------------------------------------------------------
