<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

</head>

<body bgcolor="#ffffff" text="#000000">

Hi Brent,<br>

<br>

You're welcome. <br>

<br>

What you found in the SequenceUtils.py module is Chris Hart's
code for reverse complementing DNA sequences. The short answer is that I
don't know. If I remember correctly, he did some benchmarking that showed
this was faster. I've CC'ed Chris, as he probably has a better answer.
I think that particular module was written pre-2002 and was therefore
benchmarked against an older version of Python. Chris?<br>

<br>
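For comparison, here is a minimal sketch of the quoted map-based reverse next to the two built-in alternatives. The function names, the test sequence, and the repeat count are made up for illustration, and it is written for Python 3 (so range stands in for xrange). As far as I know, extended slices on strings arrived in Python 2.3 and reversed() in Python 2.4, so if the module really predates 2002, neither would have been available when it was benchmarked.<br>
<pre wrap="">
```python
import operator
import timeit


def reverse_map(seq):
    """Reverse a string the way the quoted SequenceUtils.py snippet does."""
    # map() walks both arguments in lockstep, so operator.getitem needs one
    # reference to seq per index; [seq] * len(seq) supplies those references.
    indices = range(len(seq) - 1, -1, -1)  # xrange in Python 2
    return ''.join(map(operator.getitem, [seq] * len(seq), indices))


def reverse_slice(seq):
    """Reverse a string with an extended slice."""
    return seq[::-1]


def reverse_reversed(seq):
    """Reverse a string with the built-in reversed()."""
    return ''.join(reversed(seq))


if __name__ == '__main__':
    dna = 'ACGTTGCA' * 1000  # made-up test sequence, not real data
    for func in (reverse_map, reverse_slice, reverse_reversed):
        # Time 200 calls of each variant on the same input.
        seconds = timeit.timeit(lambda: func(dna), number=200)
        print('%-18s %.4f s' % (func.__name__, seconds))
```
</pre>
All three return the same string. Which one is fastest depends on the Python version, so it is worth timing them on your own interpreter.<br>
<br>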

Let me know if you have any other questions or requests for

documentation and I'll see about writing some additional

examples/tutorials.<br>

<br>

-Brandon<br>

<br>

P.S. Chris, you might want to check out <a class="moz-txt-link-freetext" href="http://bio.scipy.org/">http://bio.scipy.org/</a><br>

<br>

Brent Pedersen wrote:

<blockquote

 cite="mid:e183a99d0801161607l380863dex36d3ece38b041486@mail.gmail.com"

 type="cite">

  <pre wrap="">hi, and thanks for making this available. i have immediate use for the

fasta indexing.

what's this in SequenceUtils.py? am i missing something or is

[seq]*len(seq) unnecessary? what's the advantage over python's

reversed?

def reverse(seq):

  """reverse a sequence"""

  return(''.join(map(operator.getitem, [seq]*len(seq),

                     xrange(len(seq)-1, -1, -1))))

On Jan 16, 2008 3:29 PM, Brandon King <a class="moz-txt-link-rfc2396E" href="mailto:kingb@caltech.edu">&lt;kingb@caltech.edu&gt;</a> wrote:

  </pre>

  <blockquote type="cite">

    <pre wrap="">Hi All,

I recently needed to generate the sequence lengths for all Human and

Mouse RefSeq mRNAs. I figured it was worthwhile to document how I did

it to try to get in the habit of making tutorials as I work on solving

specific problems.

I used the bhutils.Fasta module (LGPL license) that Joe Roden and I

wrote for the BioHub project so that we could extract sequence

from chromosome-sized Fasta files w/o using up a lot

of memory. It also needed to handle extracting sequence from Fasta files

that contain many sequences.

It is also very useful for selectively plucking sequences from multiple

fasta files w/ multiple sequences and combining them into a new

multi-sequence fasta file and it doesn't take much code to accomplish

these tasks. I will try to make tutorials for these as well.

It might be useful if someone wants to post similar tutorials for how

these would be done with biopython, pygr, or other packages as well.

bhutils is a collection of modules that I found useful and that could

easily be extracted from the BioHub project (making it a much lighter


weight package). <a class="moz-txt-link-freetext" href="http://woldlab.caltech.edu/html/bhutils/">http://woldlab.caltech.edu/html/bhutils/</a>.

The tutorial can be found here:

<a class="moz-txt-link-freetext" href="http://bio.scipy.org/wiki/index.php/Multisequence_fasta_sequence_lengths_with_bhutils">http://bio.scipy.org/wiki/index.php/Multisequence_fasta_sequence_lengths_with_bhutils</a>

-Brandon King

_______________________________________________

biology-in-python mailing list - <a class="moz-txt-link-abbreviated" href="mailto:bip@lists.idyll.org">bip@lists.idyll.org</a>.

See <a class="moz-txt-link-freetext" href="http://bio.scipy.org/">http://bio.scipy.org/</a> for our Wiki.

    </pre>

  </blockquote>


</blockquote>

</body>

</html>