[khmer] normalize-by-median.py Hanging

Daniel Standage daniel.standage at gmail.com
Wed Sep 17 13:27:09 PDT 2014


Before running norm-by-median, I

   - downloaded the SRA file
   - used fastq-dump to create paired Fastq files
   - used interleave-reads to create a Fastq file in the One True Format

All of the Fastq files seem to be fine, i.e. none appear truncated. Memory
usage is remaining constant and CPU utilization is 100%, but the weird thing
is that, as far as I can tell, the norm-by-median script is complete. It has
processed all the input, given a final report, and written all of the kept
reads to the output, except that the last read is missing and the
second-to-last read is cut off.
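
For anyone who wants to rule out truncation on their own files, here is a
minimal sketch of that kind of check. It assumes plain four-line Fastq
records (no wrapped sequence lines), and the function name check_fastq is
made up for illustration; it is not part of khmer.

```python
import io


def check_fastq(handle):
    """Scan a four-line-per-record Fastq stream.

    Returns (complete_records, problem), where problem is None if every
    record is well formed, or a short description of the first bad record.
    """
    count = 0
    while True:
        header = handle.readline()
        if not header:
            return count, None  # clean end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = handle.readline()
        plus = handle.readline()
        qual = handle.readline()
        if not header.startswith("@"):
            return count, "record %d: header does not start with '@'" % (count + 1)
        if not plus.startswith("+"):
            return count, "record %d: separator line does not start with '+'" % (count + 1)
        if len(seq.rstrip("\n")) != len(qual.rstrip("\n")):
            return count, "record %d: sequence and quality lengths differ" % (count + 1)
        count += 1


# Example on in-memory data; for a real file, pass open("reads.fastq") instead.
good = io.StringIO("@r1/1\nACGT\n+\nIIII\n@r1/2\nTTGA\n+\nIIII\n")
cut = io.StringIO("@r1/1\nACGT\n+\nIIII\n@r1/2\nTTGA\n+\nII")
good_result = check_fastq(good)
cut_result = check_fastq(cut)
```

A file whose final quality line is shorter than its sequence line, like the
tail of the .keep file quoted below, would be reported as a problem in its
last record.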


--
Daniel S. Standage
Ph.D. Candidate
Computational Genome Science Laboratory
Indiana University

On Wed, Sep 17, 2014 at 4:21 PM, C. Titus Brown <ctb at msu.edu> wrote:

> Hi Daniel,
>
> sounds like an infinite loop of some sort :(.
>
> A few questions --
>
> What version of khmer are you using?
>
> Have you run the reads file through any other software?  I'm worried
> that the file is truncated in some way.
>
> Do you know how far through your reads file it's gotten?
>
> Is memory usage increasing or remaining constant?
>
> thanks,
> --titus
>
> On Wed, Sep 17, 2014 at 04:16:37PM -0400, Daniel Standage wrote:
> > Hi all,
> >
> > I am seeing some strange behavior running normalize-by-median.py. The
> > program seemed to complete successfully after 30-45 minutes, but then it
> > just hung there. It's now been at least 90 minutes and it's continuing to
> > hang. The output file seems to contain all the data except the last
> > record, and the second-to-last record is cut off.
> >
> > (khmer-env)[standage at bggnomic qc] tail SRR494178_int.fastq.keep
> > +
> > GBGED>>E##################################################################################
> > @SRR494178.12090255/1
> > TCGAGGACNACCTTTTGACCCTTCTGCAACCTTTGAATTTCAGACATCAAACTCTCCCTCTGTCGTGTCTCCNNCAATGATGGGTCGGGC
> > +
> > IIIIIGGG#GGGGGGIIIIIIIIIIIIIIIIGIHIIIIIGIIIIIIIIIIIIIHIIIIHIEGHHIFIHII=?##?;9>>;IGBFFGBD8G
> > @SRR494178.12090255/2
> > GATTCCGTCACCGAGGAGTATCCGTTGCCGAGGTTGTGCGTCTGTCGAACCTGGCCGTTCTTTTTGACCGTGTAGGTGCCGCCGTTGATC
> > +
> > IIIIIIHIIIIIIIIIBIHHIIIGIIIIIII(khmer-env)[standage at bggnomic qc]
> >
> > Any ideas as to what could be causing this?
> >
> > Thanks,
> > Daniel
> >
> > PS.
> >
> >    - OS: Fedora 20 with lots of RAM (100s of GB)
> >    - Command: normalize-by-median.py -k 17 -p -N 4 -x 8e9 SRR494178_int.fastq
> >    - Data: http://www.ncbi.nlm.nih.gov/sra/?term=SRR494178
> >
> >
> > --
> > Daniel S. Standage
> > Ph.D. Candidate
> > Computational Genome Science Laboratory
> > Indiana University
>
> > _______________________________________________
> > khmer mailing list
>
> --
> C. Titus Brown, ctb at msu.edu
>