[bip] mRNA lengths of all Human/Mouse refseqs (tutorial)

Brent Pedersen bpederse at gmail.com
Thu Jan 17 14:26:57 PST 2008


On Jan 17, 2008 1:58 PM, Andrew Dalke <dalke at dalkescientific.com> wrote:
> On Jan 17, 2008, at 9:49 PM, Brent Pedersen wrote:
> > cython big guns make it another 2x as
> > fast as seq[::-1].
> >
> > cdef extern from "string.h":
> >     cdef Py_ssize_t strlen(char *)
> >
> > def inplace_rev(char* seq):
> >     cdef int l = strlen(seq)
> >     cdef int i
> >     for i from 0 <= i < l / 2 :
> >         seq[i] , seq[l - i - 1] =  seq[l - i - 1],  seq[i]
>
> I didn't know there was a spinoff from Pyrex - cool!
>
> However, doesn't that break a fundamental assumption in Python that
> strings are immutable?  The following should cause first breakage
> then major breakage:
>
> import string, strrev
> strrev.inplace_rev(string.ascii_letters)
>
> import __builtin__
> for k in dir(__builtin__):
>    strrev.inplace_rev(k)
>
>
> Because of the immutable constraint, the Python implementation is
> free to reuse existing strings, which is why the following happens
> (using my Pyrex implementation, below):
>
>  >>> import strrev
>  >>> s = "This"
>  >>> t = "This"
>  >>> strrev.inplace_rev(s)
>  >>> t
> 'sihT'
>  >>>
>
>
> There's also a problem if the string contains a NUL character, since
> you're using the C definition of string length and not Python's.  You
> should be able to call len(seq) directly instead of using strlen.
>
> Here's my version of your function.  I changed it slightly because I
> thought this was easier to inspect and verify that it works for even
> and odd lengths, and that it doesn't do an extra swap at the middle
> point of odd length strings.
>
> def inplace_rev(char* seq):
>      cdef int start, end
>      start = 0
>      end = len(seq)-1  # might not work if len(seq) > C's max signed int
>      while start < end:
>          seq[start] , seq[end] =  seq[end],  seq[start]
>          start = start + 1
>          end = end - 1
>
>
>                                 Andrew
>                                 dalke at dalkescientific.com
>
>
>
>
> _______________________________________________
> biology-in-python mailing list - bip at lists.idyll.org.
>
> See http://bio.scipy.org/ for our Wiki.
>

Heh, agreed on all the reservations about mutating a supposedly immutable
structure. It's like playing god with DNA: a change like that can affect
any number of descendants, so to speak.
Not to mention the absurdity of trading the 6 characters of "[::-1]" for 10+
lines in an externally compiled library.

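The NUL point raised above is also easy to see from plain Python. This is
only a sketch using the standard ctypes module to get a C-style view of the
data; the example bytes are made up, and ctypes stands in for what strlen
would see rather than calling strlen itself.

```python
import ctypes

seq = b"ACGT\x00TTGA"  # hypothetical data: 9 bytes with an embedded NUL

# Python stores the length explicitly, so the NUL is just another byte.
python_len = len(seq)

# A C-style char* view stops at the first NUL, as strlen would.
c_view = ctypes.c_char_p(seq).value
c_len = len(c_view)
```

A reversal driven by the C-style length would only swap the bytes before
the NUL and leave the rest of the sequence where it was.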

