[bip] BLAST and FASTA performance

Peter Clarke resurgo at gmail.com
Sat Jul 19 13:19:20 PDT 2008


The same people who implemented FASTA on the GPU are working on BLAST
now. I don't have the reference handy (I am texting from a cellphone),
but the paper should be linked from Wikipedia. Their website mentions
the BLAST work. Also, for GPU and Python, have a look at pystream (on
Google Code) and its more harshly licensed successor from Tech-X
Corporation.

-Pete

On 7/19/08, C. Titus Brown <ctb at msu.edu> wrote:
> On Sat, Jul 19, 2008 at 01:22:12PM -0500, Bruce Southey wrote:
> -> On Fri, Jul 18, 2008 at 2:30 PM, Andrew Dalke <dalke at dalkescientific.com> wrote:
> -> > This is curiosity as I'm preparing my presentation for EuroSciPy
> -> > and updating myself on some of the bioinformatics parts.  I was
> -> > wondering how newer processors, newer instruction sets, and GPUs
> -> > have affected similarity searches.  How long does it take to search
> -> > all of present-day GenBank compared to the equivalent search
> -> > 5 years ago?  What's grown faster, database size or search
> -> > performance?
> ->
> -> As I recall, the basic algorithm is linear in the size of the
> -> database. But the score statistics (e-value) do change with database
> -> size (and query size). So really the database size is irrelevant from
> -> the performance aspect. The gains are more on how to handle the
> -> database better such as keeping the BLAST database in memory instead
> -> of restarting new every time (I think I showed that in some thread).
>
> Are you sure?  The algorithm itself shouldn't be linear in the size of
> the database, although some of the preparatory steps (constructing the
> hash table of words) are, of course.
>
> --titus
>
> _______________________________________________
> biology-in-python mailing list - bip at lists.idyll.org.
>
> See http://bio.scipy.org/ for our Wiki.
>
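
On the "hash table of words" point quoted above, here is a minimal
sketch of word-based seeding. It is a toy illustration, not the code of
any BLAST release: the function names, the word size k=3, and the choice
to index the database rather than the query are all assumptions made for
the example. Building the index touches every database position once
(linear in database size); each query word is then a single hash lookup.

```python
from collections import defaultdict


def build_word_index(database, k=3):
    """Map each length-k word to the (sequence id, offset) pairs where it occurs.

    Hypothetical helper: one pass over every position of every sequence,
    so the cost is linear in the total database length.
    """
    index = defaultdict(list)
    for seq_id, seq in database.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((seq_id, i))
    return index


def find_seeds(query, index, k=3):
    """Return (query offset, sequence id, database offset) for exact word hits."""
    seeds = []
    for q in range(len(query) - k + 1):
        # .get() avoids creating empty entries in the defaultdict
        for seq_id, d in index.get(query[q:q + k], ()):
            seeds.append((q, seq_id, d))
    return seeds


# Toy data, made up for the example
database = {"seq1": "ACGTACGT", "seq2": "TTGACA"}
index = build_word_index(database)
seeds = find_seeds("GTAC", index)
print(seeds)
```

Each seed would then be extended into a longer alignment; that step is
left out here. The lookup cost depends on how many hits each word has,
which is why common words in a large database still make searches slow.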


-- 
Saving the DNA of the world's endangered animals


