[khmer] khmer stripped header information from RNA-seq reads, rendering them unusable

Ramakrishnan Srinivasan ramrs at nyu.edu
Thu Jul 17 11:53:07 PDT 2014


Just a thought: maybe the part of the header that carries the paired-end
info is being stripped because, owing to the white space, it is not seen as
part of the header. If re-running is feasible, could we try with the blank
spaces replaced by a placeholder (such as a double underscore)?
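
Something along these lines might do for the substitution. It is only a
rough sketch: it assumes plain four-line FASTQ records, and it assumes the
placeholder "__" never occurs in the original headers. The function names
are made up for illustration and are not part of khmer.

```python
def protect_header(header, placeholder="__"):
    """Join the whitespace-separated fields of a header with a placeholder."""
    return placeholder.join(header.split())


def restore_header(header, placeholder="__"):
    """Undo protect_header; each placeholder becomes a single space."""
    return header.replace(placeholder, " ")


def protect_fastq(lines, placeholder="__"):
    """Yield FASTQ lines with whitespace in each header line replaced.

    Assumes strict four-line records, so every fourth line (starting
    with the first) is a header. Sequence, '+' and quality lines are
    passed through unchanged, apart from the stripped newline.
    """
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if i % 4 == 0:
            line = protect_header(line, placeholder)
        yield line
```

Running the raw reads through protect_fastq before the khmer step, and
restore_header over the header lines of the output afterwards, should bring
the original fields back, with the caveat that tabs or repeated spaces in
the original come back as single spaces.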

--
Ram


On Thu, Jul 17, 2014 at 2:36 PM, Erich Marquard Schwarz <ems394 at cornell.edu>
wrote:

> On Jul 17, 2014, at 2:25 PM, Philipp Schiffer <philipp.schiffer at gmail.com>
> wrote:
>
> > you can indeed do whatever you like there. However, as I tried to
> > indicate, it might really make sense to go with -1, -2 or /1, /2.
> > My guess is that a lot of scripts could struggle with the "#" you are
> > using.
>
>     Any script that can handle older-format Illumina reads will do fine,
> which is all of mine, along with many standard programs (e.g., older bowtie
> works fine with the older format, because that older format was the
> standard when bowtie was first designed!).  So, you may be right in many
> cases, but in this case I don't really need to worry about using the older
> Illumina format.  As with all things Unix, it is a matter of *which* pesky
> details are going to be lethal in a given context.
>
>
> > Meanwhile it is also possible to just "repair" the reads in the .keep
> > file by comparison with the raw reads file where headers have been fixed.
> > Might save some time....
>
>     Ugh.  I recognize that you are correct, and in some circumstances I
> would do that, but I think trying to fix a munged file is inherently more
> error-prone than just making a file that will be bullet-proof.  Again,
> you're certainly not wrong, but it's a question of what particular gotchas
> one is trying to steer clear of.
>
>     I recently told a bioinformatics class in Yerevan, Armenia: "A great
> deal of bioinformatics consists of converting data from one file format to
> another, rather than actually doing computations on the data."  Sad, but
> true.
>
>
> > Good luck
>
>     Thanks!
>
>
> --Erich