[ged] Paper for Wednesday

Jason Pell jason.pell at gmail.com
Fri Feb 25 19:53:14 PST 2011


Hi everyone,

On Wednesday, I will present the paper entitled "High-quality draft
assemblies of mammalian genomes from massively parallel sequence
data." This is the latest version of the ALLPATHS assembler, which is
designed to assemble mammalian-sized genomes. I will probably focus
mostly on the evolution of ALLPATHS (to show how it works compared
with other assemblers) and then discuss some of the improvements made
in the most recent version.

http://www.pnas.org/content/early/2010/12/20/1017351108.abstract

Here is the abstract:

Massively parallel DNA sequencing technologies are revolutionizing
genomics by making it possible to generate billions of relatively
short (∼100-base) sequence reads at very low cost. Whereas such data
can be readily used for a wide range of biomedical applications, it
has proven difficult to use them to generate high-quality de novo
genome assemblies of large, repeat-rich vertebrate genomes. To date,
the genome assemblies generated from such data have fallen far short
of those obtained with the older (but much more expensive)
capillary-based sequencing approach. Here, we report the development
of an algorithm for genome assembly, ALLPATHS-LG, and its application
to massively parallel DNA sequence data from the human and mouse
genomes, generated on the Illumina platform. The resulting draft
genome assemblies have good accuracy, short-range contiguity,
long-range connectivity, and coverage of the genome. In particular,
the base accuracy is high (≥99.95%) and the scaffold sizes (N50 size =
11.5 Mb for human and 7.2 Mb for mouse) approach those obtained with
capillary-based sequencing. The combination of improved sequencing
technology and improved computational methods should now make it
possible to increase dramatically the de novo sequencing of large
genomes. The ALLPATHS-LG program is available at
http://www.broadinstitute.org/science/programs/genome-biology/crd.
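
For anyone unfamiliar with the N50 statistic quoted in the abstract,
here is a minimal sketch of the usual definition: the length L such
that scaffolds of length L or longer contain at least half of the
total assembled bases. The function name n50 and the example lengths
are made up for illustration. This is not code from ALLPATHS-LG, and
the paper may compute the statistic with a slightly different
convention.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Assumes the common definition: sort lengths from longest to
    shortest and return the length at which the running total first
    reaches half of the summed length.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Illustrative lengths only (not data from the paper).
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_lengths))
```

A larger N50 means more of the assembly sits in long scaffolds, which
is why the abstract uses it as a measure of long-range connectivity.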


