[ged] Journal Club next week

Qingpeng Zhang qingpeng at msu.edu
Tue Mar 6 17:25:19 PST 2012


I will be presenting the paper "Assessment of metagenomic assembly
using simulated next generation sequencing data" at next week's
journal club.

http://www.ncbi.nlm.nih.gov/pubmed/22384016

Assessment of metagenomic assembly using simulated next generation
sequencing data.
Mende DR, Waller AS, Sunagawa S, Järvelin AI, Chan MM, Arumugam M,
Raes J, Bork P.

European Molecular Biology Laboratory, Heidelberg, Germany.

Abstract
Due to the complexity of the protocols and a limited knowledge of the
nature of microbial communities, simulating metagenomic sequences
plays an important role in testing the performance of existing tools
and data analysis methods with metagenomic data. We developed
metagenomic read simulators with platform-specific (Sanger,
pyrosequencing, Illumina) base-error models, and simulated metagenomes
of differing community complexities. We first evaluated the effect of
rigorous quality control on Illumina data. Although quality filtering
removed a large proportion of the data, it greatly improved the
accuracy and contig lengths of resulting assemblies. We then compared
the quality-trimmed Illumina assemblies to those from Sanger and
pyrosequencing. For the simple community (10 genomes) all sequencing
technologies assembled a similar amount and accurately represented the
expected functional composition. For the more complex community (100
genomes) Illumina produced the best assemblies and more correctly
resembled the expected functional composition. For the most complex
community (400 genomes) there was very little assembly of reads from
any sequencing technology. However, due to the longer read length the
Sanger reads still represented the overall functional composition
reasonably well. We further examined the effect of scaffolding of
contigs using paired-end Illumina reads. It dramatically increased
contig lengths of the simple community and yielded minor improvements
to the more complex communities. Although the increase in contig
length was accompanied by increased chimericity, it resulted in more
complete genes and a better characterization of the functional
repertoire. The metagenomic simulators developed for this research are
freely available.


Sincerely yours,

Qingpeng Zhang
Ph.D. Student
GED Lab in 2228 BPS Building
Department of Computer Science and Engineering
Michigan State University
East Lansing, MI 48824


