[bip] mRNA lengths of all Human/Mouse refseqs (tutorial)
James Taylor
james at jamestaylor.org
Thu Jan 17 14:00:56 PST 2008
Now that is incredibly scary. Strings are immutable; the C API is
very clear on this, and for good reasons. And yet this code appears
to do exactly what it claims to do:
>>> f = "foobar"
>>> buffer(f)
<read-only buffer for 0x2affda05ae40, size -1, offset 0 at 0x2affda05bed8>
>>> rev.inplace_rev( f )
>>> f
'raboof'
>>> buffer(f)
<read-only buffer for 0x2affda05ae40, size -1, offset 0 at 0x2affda05bf10>
Bad bad bad! Interned strings really make this a mess, because two
equal string literals can end up as one shared object. For example:
>>> f = "foobar"
>>> g = "foobar"
>>> f
'foobar'
>>> g
'foobar'
>>> id(f)
47278362832592
>>> id(g)
47278362832592
>>> buffer( f )
<read-only buffer for 0x2affda05aed0, size -1, offset 0 at 0x2affda05bed8>
>>> buffer( g )
<read-only buffer for 0x2affda05aed0, size -1, offset 0 at 0x2affda05bf48>
>>> rev.inplace_rev( f )
>>> f
'raboof'
>>> g
'raboof'
Yikes!
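
For contrast, here is a minimal sketch of the safe route. It assumes
Python 3 (where sys.intern replaces the old intern builtin), and the
helper names reversed_copy and inplace_rev_safe are hypothetical, not
part of the code quoted below. It only illustrates that rebinding a
name to a reversed copy leaves other references alone, and that
in-place reversal is fine on a type that is mutable by design:

```python
import sys


def reversed_copy(seq):
    """Return a new reversed string; the argument is left untouched."""
    return seq[::-1]


def inplace_rev_safe(buf):
    """Reverse a bytearray in place; bytearray is mutable by design."""
    buf.reverse()


# sys.intern guarantees that both names refer to one shared string object,
# which is the situation shown in the session above.
f = sys.intern("foobar")
g = sys.intern("foobar")
shared = f is g

# Slicing builds a new string, so only the name f is rebound here;
# g still refers to the original, unmodified object.
f = reversed_copy(f)

# For true in-place work, start from an explicit, private mutable copy.
buf = bytearray(b"foobar")
inplace_rev_safe(buf)
```

The copy costs an allocation, but nothing else that happens to hold
the same string object can change behind your back.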
-- jt
On Jan 17, 2008, at 3:49 PM, Brent Pedersen wrote:
> since you did declare it a war. cython big guns make it another 2x as
> fast as seq[::-1].
>
> cdef extern from "stdio.h":
>     cdef Py_ssize_t strlen(char *)
>
> def inplace_rev(char* seq):
>     cdef int l = strlen(seq)
>     cdef int i
>     for i from 0 <= i < l / 2:
>         seq[i], seq[l - i - 1] = seq[l - i - 1], seq[i]