[khmer] Extracting the original reads after diginorm + partitioning

Adina Chuang Howe adina.chuang at gmail.com
Thu Mar 7 11:10:17 PST 2013


Possible bug in sweep-reads: I'm not recovering the partitioned reads
from the original dataset.

I first noticed this when I was looking at lots of partitions and trying
to recover the swept reads; some of the sweep files come out empty:

command:
python sweep-reads3.py -N 4 -k 32 -x 1e9 \
    /mnt/research/gpgc/hmp-mock-partitions/001264-files/no-sweep-pids/pid*fa \
    /mnt/research/gpgc/hmp-mock-partitions/SRR-combined.fastq

troubleshooting:
I then looked at just one partition
(on HPC, in /mnt/scratch/howead/test):
python sweep-reads3.py pid-42391.fa SRR-combined.fastq

The resulting sweep file is empty.

If I run:
python sweep-reads3.py pid-42391.fa pid-42391.fa

Behavior is correct.
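
One way to check, independently of sweep-reads3.py, whether
SRR-combined.fastq shares any k-mers at all with the partition is a
plain-Python sketch like the one below. It does not use khmer and is not
how sweep-reads3.py works internally. The function names are made up for
illustration, k=32 just mirrors the -k 32 above, and treating a k-mer and
its reverse complement as the same k-mer is an assumption. It holds every
k-mer of the partition in memory, so it is only meant for a single small
partition file.

```python
# Hypothetical sanity check, not part of khmer: count the reads in a FASTQ
# file that share at least one k-mer with the sequences in a FASTA file.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def read_fasta(path):
    """Yield (name, sequence) for each record in a FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                name, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def read_fastq(path):
    """Yield (name, sequence) for each 4-line FASTQ record."""
    with open(path) as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # skip stray blank lines
            seq = handle.readline().strip()
            handle.readline()  # '+' separator line
            handle.readline()  # quality line
            yield header.strip()[1:], seq


def canonical_kmers(seq, k):
    """Yield each k-mer of seq as the smaller of itself and its reverse complement."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        revcomp = kmer.translate(_COMPLEMENT)[::-1]
        yield min(kmer, revcomp)


def count_matching_reads(partition_fa, reads_fq, k=32):
    """Return (matched, total) reads in reads_fq sharing a k-mer with partition_fa."""
    wanted = set()
    for _, seq in read_fasta(partition_fa):
        wanted.update(canonical_kmers(seq, k))
    matched = total = 0
    for _, seq in read_fastq(reads_fq):
        total += 1
        if any(kmer in wanted for kmer in canonical_kmers(seq, k)):
            matched += 1
    return matched, total
```

Calling count_matching_reads("pid-42391.fa", "SRR-combined.fastq") would
give a rough answer. Zero matching reads would suggest the two input files
really have nothing in common (wrong file, or reads changed along the
way); a nonzero count would point back at the sweep script or its
parameters.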

Advice?
Thanks,
Adina



