[bip] fast sequence searching algorithm... in python

Sagar Damle sagar at caltech.edu
Tue Sep 25 19:30:07 PDT 2007


Hi all,
   I need a fast sequence-matching algorithm for DNA/RNA/protein
sequences.  My queries are relatively short (<100 bp) and the
'subject' sequence is on the order of 10 kb.  At the moment, I'm
using re.finditer() from Python's regular expression module and
calling span() on each match object to collect all matches:

import re
# Escape the query so it is matched literally, ignoring case.
pattern = re.escape(str(query))
matches = [match.span() for match in re.finditer(pattern, str(subject), re.IGNORECASE)]

   This returns the coordinates of every occurrence of my query in the
subject as a list of (match start, match stop) tuples, but it's
somewhat sluggish for my needs.  Is there a way to do it faster
within Python?
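
   For comparison, here is a minimal sketch of one alternative that
might be worth timing against the regex version: a plain str.find()
loop on lower-cased copies of both strings.  The function name
find_all and its overlapping flag are hypothetical, made up for this
illustration.  It assumes plain ASCII sequence letters (so that
lower-casing never changes string length) and a non-empty query.
Whether it is actually faster depends on the data, so it should be
measured (e.g. with timeit) rather than taken for granted.

```python
def find_all(query, subject, overlapping=False):
    """Return (start, stop) tuples for each case-insensitive exact match.

    Hypothetical helper for illustration; assumes ASCII sequence letters.
    """
    q = str(query).lower()
    s = str(subject).lower()
    if not q:
        return []  # an empty query has no meaningful coordinates

    # Step by the query length to mimic re.finditer (non-overlapping),
    # or by one position to also report overlapping hits.
    step = 1 if overlapping else len(q)

    spans = []
    start = s.find(q)
    while start != -1:
        spans.append((start, start + len(q)))
        start = s.find(q, start + step)
    return spans


matches = find_all("ACG", "acgtACGacg")
```

   With overlapping=False the result should agree with the
re.finditer() list above; overlapping=True additionally reports hits
that share bases, which the regex approach skips.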

Sagar Damle
Graduate Student
Biology Division, Caltech
1200 E. California Blvd, 156-29
Pasadena, CA 91125
