[khmer] is_pair sensitive to order

Keith Robison keith.e.robison at gmail.com
Wed Dec 30 07:09:32 PST 2015


I've never had a problem before, but then again I've always either (a)
explicitly made sure R1 & R2 were in that order or (b) sorted the files
(which usually puts them in the right order).

In a script I wrote over the holidays, I forgot to do this, and it worked
on half the runs & failed on the other half. Then I substituted
'/path/to/fastq/*.fastq.gz' and they all worked -- which had me scratching
my head until this morning, when it hit me that I hadn't sorted them before!

So, I'm trying to make you do some work & take a performance hit to prevent
something that simply sorting the list of input files would avoid! :-)

More seriously, perhaps interleave-reads.py could do the swap automatically.
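
To make the idea concrete, here is a rough sketch of an order-insensitive
pair check with an automatic swap. This is not khmer's code: the names
split_read_name, is_pair_any_order and order_pair are hypothetical, and it
assumes only two naming conventions (a "/1" or "/2" suffix on the read ID,
or an Illumina-style comment starting with "1:" or "2:").

```python
import re

# Hypothetical helpers for illustration only; not khmer's actual API.
_SLASH_SUFFIX = re.compile(r"^(?P<base>\S+)/(?P<mate>[12])$")


def split_read_name(name):
    """Return (base_id, mate), where mate is 1, 2, or None if unknown."""
    parts = name.split(None, 1)
    if not parts:
        return name, None
    ident = parts[0]
    # Old-style names: "read7/1" and "read7/2".
    match = _SLASH_SUFFIX.match(ident)
    if match:
        return match.group("base"), int(match.group("mate"))
    # Assumed Illumina-style names: "read7 1:N:0:ACGT" and "read7 2:N:0:ACGT".
    if len(parts) == 2 and parts[1][:2] in ("1:", "2:"):
        return ident, int(parts[1][0])
    return ident, None


def is_pair_any_order(name_a, name_b):
    """True if the two names are mates 1 and 2 of one read, in either order."""
    base_a, mate_a = split_read_name(name_a)
    base_b, mate_b = split_read_name(name_b)
    return base_a == base_b and {mate_a, mate_b} == {1, 2}


def order_pair(name_a, name_b):
    """Return the two names as (R1, R2), swapping them if necessary."""
    if not is_pair_any_order(name_a, name_b):
        raise ValueError("not a read pair: %r, %r" % (name_a, name_b))
    if split_read_name(name_a)[1] == 1:
        return name_a, name_b
    return name_b, name_a
```

With this, order_pair("read7/2", "read7/1") gives the names back in R1, R2
order instead of rejecting them. The extra cost is one more name parse per
record, which is the performance hit I mentioned; sorting the input file
list up front avoids it entirely.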

On Wed, Dec 30, 2015 at 10:01 AM, C. Titus Brown <ctbrown at ucdavis.edu>
wrote:

> On Wed, Dec 30, 2015 at 10:00:49AM -0500, Keith Robison wrote:
> > Probably few people would see this as a bug, but is_pair in utils.py (and
> > by extension, interleave_pairs.py) is sensitive to the order the two
> > sequences are given in -- if you give sequences (FASTQ files for
> > interleave_pairs.py) in the order R2, R1, the function will say they are
> > not paired.
>
> It certainly is! Are there programs that accept R2 before R1?
>
> best,
> --titus
>