[khmer] partitioning pipeline output, fastq

Fri May 24 06:06:01 PDT 2013

Correct!
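
For readers who want to try the quality re-attachment described in the question quoted below, here is a minimal sketch, not a khmer tool. It assumes the partitioned FASTA headers are the original read name followed by a tab and a partition ID, that read names are otherwise identical between the two files, and that the FASTQ uses four lines per record. The function names (read_fastq, read_fasta, attach_qualities) are hypothetical helpers made up for this illustration.

```python
def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:           # end of file
            return
        if not header.strip():   # tolerate stray blank lines
            continue
        seq = handle.readline().rstrip("\n")
        handle.readline()        # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n")[1:], seq, qual


def read_fasta(handle):
    """Yield (header, sequence) from a FASTA stream; sequences may be wrapped."""
    name, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def attach_qualities(fastq_handle, fasta_handle, out_handle):
    """Write a FASTQ record for each partitioned read whose name and sequence
    match a record in the original FASTQ. Returns the number written."""
    # Holds every FASTQ record in memory; a large data set would need an
    # on-disk index instead.
    quals = {name: (seq, qual) for name, seq, qual in read_fastq(fastq_handle)}
    written = 0
    for header, seq in read_fasta(fasta_handle):
        # Assumption: the partition ID follows the read name after a tab.
        read_name = header.split("\t")[0]
        entry = quals.get(read_name)
        # Skip reads that are missing or whose sequence changed, since the
        # quality string would no longer line up with the bases.
        if entry is None or entry[0] != seq:
            continue
        out_handle.write("@%s\n%s\n+\n%s\n" % (header, seq, entry[1]))
        written += 1
    return written
```

The sequence comparison is a deliberate safety check: if an upstream step trimmed or altered a read, it is dropped rather than paired with a mismatched quality string. The partition ID is kept in the output header; strip it if a downstream tool objects to tabs in read names.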

---
C. Titus Brown, ctb at msu.edu

On May 24, 2013, at 8:59, Jens-Konrad Preem <jpreem at ut.ee> wrote:

> Hi,
> this is similar to my question about the filter-below-abund.py output, which has already been solved. Thanks!
> The input and output of the partitioning pipeline, as described in your Guide and in the example of partitioning large data on your website, are FASTA-formatted files. The next step for partitioned data would be assembly. I am thinking of pre-assembling the mate pairs with FLASH* before full assembly with SOAPdenovo2 or Velvet. The input files for FLASH are FASTQ.
> 
> Do I understand correctly that nothing happens to the sequences themselves during partitioning, and that they are just binned/sorted into groups/partitions?
> In that case, would it be no problem for me to take the quality scores from the filter-below-abund.py FASTQ output (the brother of the filter-below-abund.py FASTA output :D) and just attach them to the partitioned sequences?
> 
> Jens
> 
> 
> 
> * They seem to imply that genome assembly further down the line would be remarkably improved, at least in the case of SOAPdenovo; maybe that is not the case for Velvet, the assembler you have suggested?
> Magoč, T., & Salzberg, S. L. (2011). FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinformatics (Oxford, England), 27(21), 2957–63. doi:10.1093/bioinformatics/btr507
> 
> -- 
> Jens-Konrad Preem, MSc., University of Tartu
> 
> 
> _______________________________________________
> khmer mailing list
> khmer at lists.idyll.org
> http://lists.idyll.org/listinfo/khmer