[ged] Paper for next week

Jason Pell jason.pell at gmail.com
Wed Nov 3 11:51:03 PDT 2010


Hi everyone,

Next week, I will be presenting the paper "Optimization of de novo
transcriptome assembly from next-generation sequencing data."  The
abstract and a link to the paper are below.  See you then!

Jason

Abstract:
Transcriptome analysis has important applications in many biological
fields. However, assembling a transcriptome without a known reference
remains a challenging task requiring algorithmic improvements. We
present two methods for substantially improving transcriptome de novo
assembly. The first method relies on the observation that the use of a
single k-mer length by current de novo assemblers is suboptimal to
assemble transcriptomes where the sequence coverage of transcripts is
highly heterogeneous. We present the Multiple-k method in which
various k-mer lengths are used for de novo transcriptome assembly. We
demonstrate its good performance by assembling de novo a published
next-generation transcriptome sequence data set of Aedes aegypti,
using the existing genome to check the accuracy of our method. The
second method relies on the use of a reference proteome to improve the
de novo assembly. We developed the Scaffolding using Translation
Mapping (STM) method that uses mapping against the closest available
reference proteome for scaffolding contigs that map onto the same
protein. In a controlled experiment using simulated data, we show that
the STM method considerably improves the assembly, with few errors. We
applied these two methods to assemble the transcriptome of the
non-model catfish Loricaria gr. cataphracta. Using the Multiple-k and
STM methods, the assembly increases in contiguity and in gene
identification, showing that our methods clearly improve quality and
can be widely used. The new methods were used to assemble successfully
the transcripts of the core set of genes regulating tooth development
in vertebrates, while classic de novo assembly failed.
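
As a rough illustration of why a single k-mer length can be limiting (my own toy sketch, not code or data from the paper): two reads from a thinly covered transcript may overlap by only a few bases, so they share a k-mer at small k but not at larger k. The reads and the helper names `kmers` and `reads_overlap` are made up for this example, and "sharing a k-mer" is only a simplified stand-in for being connected in an assembler's de Bruijn graph.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reads_overlap(read_a, read_b, k):
    """True if the two reads share at least one k-mer."""
    return bool(kmers(read_a, k) & kmers(read_b, k))


# Hypothetical reads from a low-coverage transcript.
# The last 5 bases of read_a ("GATTC") are the first 5 bases of read_b.
read_a = "ACGTAGATTC"
read_b = "GATTCTTGCA"

# Check whether the reads can be linked at each k-mer length.
shared_by_k = {k: reads_overlap(read_a, read_b, k) for k in (4, 5, 6, 7)}

for k, shared in shared_by_k.items():
    print(f"k={k}: reads share a k-mer -> {shared}")
```

With a 5-base overlap, the reads link up at k=4 and k=5 but not at k=6 or k=7. Highly expressed transcripts have enough coverage to tolerate a larger k, which is roughly the motivation the abstract gives for combining several k-mer lengths.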

Link: http://genome.cshlp.org/content/20/10/1432.full


