[khmer] InvalidFASTQFileFormat

Michael R. Crusoe mcrusoe at msu.edu
Thu Jan 9 12:10:05 PST 2014


Hello Dr. Wang,

As I don't have your complete sequence file, I am not able to reproduce your
error. However, it does appear to be a known bug, which we are tracking at
https://github.com/ged-lab/khmer/issues/249

As a workaround, you can disable threading by passing "-T 1" (instead of the
"-T 4" in your command) whenever you run into this error.
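
If you would like to confirm independently that the file itself is well
formed, here is a minimal sketch of a standalone check. It is not part of
khmer and is not how khmer parses reads; the function name
first_fastq_problem and the accepted alphabet (ACGTN) are my own assumptions
for illustration, and it assumes strict four-line records. It parses by
position rather than by scanning for '@', since a quality line may legally
begin with '@' (as the second record in your excerpt does).

```python
import sys
from itertools import islice

# Assumption: only these letters count as valid bases; khmer's own rule may differ.
VALID_BASES = set("ACGTN")


def first_fastq_problem(lines):
    """Return (line_number, message) for the first malformed record, or None.

    Assumes strict four-line records: @header, sequence, +, quality.
    Line numbers are 1-based, so they can be compared with sed/awk output.
    """
    lines = iter(lines)
    line_no = 0
    while True:
        # Read the next four lines as one record, stripping line endings.
        record = [ln.rstrip("\r\n") for ln in islice(lines, 4)]
        if not record:
            return None  # clean end of input
        start = line_no + 1
        line_no += len(record)
        if len(record) < 4:
            return (start, "truncated record")
        header, seq, plus, qual = record
        if not header.startswith("@"):
            return (start, "header does not start with '@'")
        if not seq or not set(seq.upper()) <= VALID_BASES:
            return (start + 1, "missing or illegal sequence letters")
        if not plus.startswith("+"):
            return (start + 2, "separator does not start with '+'")
        if len(qual) != len(seq):
            return (start + 3, "quality length differs from sequence length")


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        problem = first_fastq_problem(handle)
    print("no problems found" if problem is None else "line %d: %s" % problem)
```

Saved under a file name of your choice and run with the FASTQ path as its
only argument, it prints the first problem it finds, if any. If it reports no
problems while khmer still aborts with "-T 4", that would be consistent with
the threading bug rather than with your data.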

My apologies for this.


On Wed, Jan 8, 2014 at 11:15 AM, Shaolin Wang
<sw4ed at eservices.virginia.edu> wrote:

> I am new; please help. I got an InvalidFASTQFileFormat error. The sequence
> looks fine, and even after I remove that sequence the error still appears,
> always at line 16181.
>
> [17:27:40]sw4ed@hpcserver:~/Meta/SoilMeta> load-into-counting.py -T 4 -k 20 -N 4 -x 4e9 Sample_2024-1F_1.kh Sample_2024-1F_1.trimmed.fastq
>
> PARAMETERS:
>  - kmer size =    20            (-k)
>  - n hashes =     4             (-N)
>  - min hashsize = 4e+09         (-x)
>
> Estimated memory usage is 1.6e+10 bytes (n_hashes x min_hashsize)
> --------
> Saving hashtable to Sample_2024-1F_1.kh
> Loading kmers from sequences in ['Sample_2024-1F_1.trimmed.fastq']
> making hashtable
> consuming input Sample_2024-1F_1.trimmed.fastq
> terminate called after throwing an instance of
> 'khmer::read_parsers::InvalidFASTQFileFormat'
>   what():  InvalidFASTQFileFormat: illegal sequence letters:
> @HISEQ700708:147:D278GACXX:2:1101:6816:3815 2:N:0:ATCACG
> Aborted (core dumped)
>
> [17:44:12]sw4ed@hpcserver:~/Meta/SoilMeta> sed -n '16181,16188p' Sample_2024-1F_1.trimmed.fastq
> @HISEQ700708:147:D278GACXX:2:1101:6816:3815 2:N:0:ATCACG
> ATTGTCTGCGCGTTACGATATTATCAAGAATCGCGACTGGTTATGGTCTCTTACTGCTACAACACTGAACACTAAGACCAAATATGCTAACATTGGCAAC
> +
> CCCFFFFFHHHHHIJJJJIJJJJJJJJJJJJJJJJJJJJJIJIIIJIHHHHHFFFFFFFFEEEDDDDDDDDDDCDDDDDDDDDDEEDEDDDDDDDDDDDD
> @HISEQ700708:147:D278GACXX:2:1101:6919:3932 1:N:0:ATCACG
> TATTCAGGAAAACCTGCCGCAGACGCTTGGGGTCGCAGGAGATTTCCGGAATGTCTTCGTCATTGTCGATATAGTCCAGTCGAATACCGTCCTGGGCCAG
> +
> @C@FFFDFFHGFHIGIIJIIIJJJJGIJIJJIFEFHIIGGDHFHHHHFF<ACAEDDCDD=?A?DD@CBDDBDDCC@CDCDD@B?C@CCB<@DD>C?BDDB
>
>
> --
> Shaolin Wang, Ph.D
> Research Scientist
> Department of Psychiatry & Neurobiology Science
> University of Virginia
> 1670 Discovery Drive, Suite 110
> Charlottesville, VA 22911
>  Phone: 434-982-0243
> Fax:434-973-7031
> E-mail: swang at virginia.edu <sw4ed at eservices.virginia.edu>
>
> _______________________________________________
> khmer mailing list
> khmer at lists.idyll.org
> http://lists.idyll.org/listinfo/khmer
>
>


-- 
Michael R. Crusoe: Software Engineer and Bioinformatician  mcrusoe at msu.edu
 @ the Genomics, Evolution, and Development lab; Michigan State University
http://ged.msu.edu/     http://orcid.org/0000-0002-2961-9670
@biocrusoe<http://twitter.com/biocrusoe>