[bip] BLAST and FASTA performance

Mon Jul 21 16:56:13 PDT 2008

On Tue, Jul 22, 2008 at 01:11:33AM +0200, Andrew Dalke wrote:
-> My technical optimism says that the algorithms should also be  
-> improving, so similarity search time should have gone down.  Given  
-> the number of people who have worked on sequence similarity  
-> algorithms, I'm surprised this hasn't happened.

Barring a clear reason, why would biologists switch away from a
successful approach?  Is SSEARCH significantly better than BLAST, or is
it just marginally better?

-> In thinking about this I have a suspicion that people use BLAST  
-> because people use BLAST.  Something I've seen in other places is  
-> that people like using widely-used tools because there's less need to  
-> justify the choice.  If you use some other tool then the first  
-> question from an article reviewer, or at a conference, will be "why
-> did you use method-X instead of BLAST?"

I think the reviewer has a point, though; the appropriate answer to that
question should be stated in the article (or supplementary material),
e.g. "We used XX because it proved to be more sensitive than BLAST in YY
and ZZ cases."

-> Another possibility is that people trust NCBI to be right, and don't  
-> trust other programs.  Or that BLAST code isn't written to be easily  
-> optimized.  Or that NCBI doesn't take in external patches.  There are  
-> too many vagaries for me to be sure about any of these.

And yes, there's all the translational work needed to get a program from
the "algorithm works!" stage to the "heck, this is rock solid, well
documented, and available in all of my important binary formats"
stage...

cheers,
--titus
-- 
C. Titus Brown, ctb at msu.edu