[ged] GED journal club reminder
Likit Preeyanon
preeyano at msu.edu
Tue Feb 8 14:57:15 PST 2011
QP will be talking about Jellyfish tomorrow at 12 pm.
A fast, lock-free approach for efficient parallel counting of occurrences of k-mers
Guillaume Marçais and Carl Kingsford
Counting the number of occurrences of every k-mer (substring of length k) in a long string is a central subproblem in many applications, including genome assembly, error correction of sequencing reads, fast multiple sequence alignment, and repeat detection. Recently, the deep sequence coverage generated by next-generation sequencing technologies has caused the amount of sequence to be processed during a genome project to grow rapidly, and has rendered current k-mer counting tools too slow and memory intensive. At the same time, large multi-core computers have become commonplace in research facilities allowing for a new parallel computational paradigm.
We propose a new k-mer counting algorithm and associated implementation, called Jellyfish, which is fast and memory efficient. It is based on a multi-threaded, lock-free hash table optimized for counting k-mers up to 31 bases in length. Due to their flexibility, suffix arrays have been the data structure of choice for solving many string problems. For the task of k-mer counting, important in many biological applications, Jellyfish offers a much faster and more memory efficient solution.
http://bioinformatics.oxfordjournals.org/content/early/2011/01/07/bioinformatics.btr011.abstract
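
For anyone who wants a concrete picture of the problem before the talk, here is a minimal single-threaded sketch of k-mer counting in Python. It is not the Jellyfish implementation: it uses an ordinary Counter rather than a lock-free hash table, and the names count_kmers, decode, and ENCODE are made up for illustration. The 2-bit packing of bases is an assumption on my part; it is shown because 31 bases at 2 bits each come to 62 bits, which fits in a 64-bit word and may help explain the 31-base limit mentioned in the abstract.

```python
from collections import Counter

# Hypothetical 2-bit code per base (an illustrative choice, not from the paper).
ENCODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def count_kmers(seq, k):
    """Count every k-mer in seq, keyed by its 2-bit packed integer."""
    if not 1 <= k <= 31:
        raise ValueError("k must be between 1 and 31")
    mask = (1 << (2 * k)) - 1  # keeps only the last k bases
    counts = Counter()
    key = 0
    valid = 0  # consecutive valid bases seen since the last reset
    for base in seq.upper():
        code = ENCODE.get(base)
        if code is None:
            # Ambiguous base such as N: restart the window.
            key, valid = 0, 0
            continue
        key = ((key << 2) | code) & mask
        valid += 1
        if valid >= k:
            counts[key] += 1
    return counts


def decode(key, k):
    """Turn a packed integer key back into its k-mer string."""
    bases = "ACGT"
    return "".join(bases[(key >> (2 * (k - 1 - i))) & 3] for i in range(k))


if __name__ == "__main__":
    for key, n in sorted(count_kmers("ACGTACGT", 3).items()):
        print(decode(key, 3), n)
```

The rolling update of key means each base is processed once, so the work is linear in the sequence length regardless of k.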
See you.