[bip] BLAST and FASTA performance
Bruce Southey
bsouthey at gmail.com
Mon Jul 21 06:53:05 PDT 2008
A_user wrote:
> Everybody is speaking about algorithms. But it is not only the algorithm that matters; the speed of our programming language matters too!
>
> For example, I recently discovered a program called DNA Baser ( http://www.dnabaser.com/ ) which seems to be compiled, as it is only 1.5MB in size ( it can be downloaded without registration ).
>
> They are using some very complex algorithm for sequence assembly and the accuracy of the results is amazing. This accuracy can't be achieved with (fast) algorithms that give partial (poor) solutions. And still the program is incredibly fast.
> So, if we don't use a compiled language, we should at least spend some time optimizing our code for speed, not only looking for new algorithms, because I have seen so many poorly written Perl/Python programs....
>
> Now, I decided to learn more about compilers. Dual-core processors have been on the market for a while and quad-core ones will be here soon as well. Compiled code can really take advantage of these new CPU features.
Current compilers do support many of these features, and even gcc can
now attempt to generate parallel code automatically (with no guarantee
that it will be better or correct). But Python and other dynamic
languages may not fully implement all of these features because of
things like multiplatform support and global locks.
You might want to first check out:
Cython (http://cython.org/) for C in Python
F2py (http://www.f2py.org/) for Fortran in Python - is/was part of NumPy
It is harder to suggest anything in terms of parallel computing because
of the complexity involved. If the problem is embarrassingly parallel
then you may just need to use threads (or separate processes, since the
global lock in CPython keeps pure-Python threads from running on
several cores at once). You may also be able to use high performance
math libraries like MKL (Intel's Math Kernel Library:
http://www.intel.com/cd/software/products/asmo-na/eng/307757.htm).
Otherwise you will need to understand parallel computing and how to get
it into your code.
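
As a rough illustration of the embarrassingly parallel case, here is a
minimal sketch in Python. The task (GC content per sequence) and the
names gc_content and gc_content_all are made up for this example and
do not come from any particular library. Each sequence is handled
independently, so the work can simply be mapped over a pool of workers.
It uses the standard library's concurrent.futures with threads. For
CPU-bound pure-Python work, swapping in ProcessPoolExecutor is the
usual way around the global lock.

```python
from concurrent.futures import ThreadPoolExecutor


def gc_content(seq):
    """Fraction of G and C bases in one sequence (hypothetical example task)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_content_all(seqs, workers=4):
    """Apply gc_content to every sequence using a pool of worker threads.

    No sequence depends on any other, which is what makes the problem
    embarrassingly parallel.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input.
        return list(pool.map(gc_content, seqs))


if __name__ == "__main__":
    sequences = ["ATGC", "GGCC", "ATAT"]
    print(gc_content_all(sequences))
```

Whether this actually runs faster than a plain loop depends on how much
work each item involves, so it is worth timing both versions first.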
Regards
Bruce
PS: Intel's quad-core processors have been available since late 2006.