[bip] Simple motif searching, and annotation visualization

Titus Brown titus at caltech.edu
Sat Dec 1 02:13:16 PST 2007


Hi all,

Those of you interested in toy and not-so-toy visualization of
annotations might like some code I banged together today;
see

	http://iorich.caltech.edu/~t/test/motifsearch.cgi/

for an interactive demonstration (that may not survive for more than a
few days...)

Briefly, this Web site takes a sequence and up to 5 IUPAC motifs, does
the motif searches, and displays the matches graphically in either PDF
or PNG form.
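
To make the search step concrete, here is a minimal Python sketch of an
IUPAC motif search. It is not the code behind the site (which presumably
leaves the searching to motility); it simply translates each IUPAC letter
into a regular-expression character class and reports every forward-strand
match, overlaps included. The names IUPAC_CODES and find_motif are made up
for this illustration.

```python
import re

# Standard IUPAC nucleotide codes as regular-expression character classes.
IUPAC_CODES = {
    'A': 'A', 'C': 'C', 'G': 'G', 'T': 'T',
    'R': '[AG]', 'Y': '[CT]', 'S': '[CG]', 'W': '[AT]',
    'K': '[GT]', 'M': '[AC]',
    'B': '[CGT]', 'D': '[AGT]', 'H': '[ACT]', 'V': '[ACG]',
    'N': '[ACGT]',
}


def find_motif(sequence, motif):
    """Return half-open (start, end) pairs for each forward-strand match.

    Hypothetical helper: the reverse strand is not searched, and an
    unknown motif letter raises KeyError.
    """
    pattern = ''.join(IUPAC_CODES[letter] for letter in motif.upper())
    # Wrapping the pattern in a lookahead makes each match zero-width,
    # so matches that overlap one another are all reported.
    regex = re.compile('(?=(%s))' % pattern)
    return [(m.start(), m.start() + len(motif))
            for m in regex.finditer(sequence.upper())]
```

On the sequence agatagata, find_motif reports gata at (1, 5) and (5, 9),
atag at (2, 6), and wgatan at (0, 6).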

This is worth mentioning only because it solves a somewhat annoying
problem, the display of overlapping annotations, in a nice, fast, and
general way: like genome browsers, it will not draw over existing
features but will instead bump a feature down to avoid a collision.
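
For anyone who has not met the "bump down" idea before, the sketch below
shows one simple way to do it in plain Python. It is a greedy first-fit
row assignment, not the pygr-based approach used in the actual code, and
the function name stack_intervals is invented for this example. Intervals
are assumed to be half-open (start, end) pairs.

```python
def stack_intervals(intervals):
    """Assign each (start, end) interval a row so rows never self-overlap.

    Returns a list of row numbers parallel to the input list. Row 0 is
    the top row; a feature is bumped to a lower row only on collision.
    """
    # Visit intervals from left to right without disturbing input order.
    order = sorted(range(len(intervals)), key=lambda i: intervals[i])
    row_ends = []                    # end of the last interval in each row
    rows = [None] * len(intervals)
    for i in order:
        start, end = intervals[i]
        for row, row_end in enumerate(row_ends):
            if row_end <= start:     # this row is free from `start` onward
                row_ends[row] = end
                rows[i] = row
                break
        else:
            # Every existing row collides, so open a new row underneath.
            row_ends.append(end)
            rows[i] = len(row_ends) - 1
    return rows
```

Given the intervals (0, 6), (1, 5), (5, 9) and (2, 6), this returns
[0, 1, 1, 2]: the third interval fits back into row 1 once the second has
ended. The inner loop makes this quadratic in the worst case, which is
fine for a sketch but is exactly the part one would want an interval
index for on large inputs.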

For example, if you search for three overlapping motifs you will see
this:

	http://iorich.caltech.edu/~t/test/motifsearch.cgi/graph?motif5=&motif4=&motif1=wgatan&motif3=atag&sequence=agatagata&motif2=gata

The stacking solution is done with pygr so it is quite fast and *should*
generalize to large sequences -- or at least large enough sequences that
you probably don't want to be motif searching them through the Web.  The
drawing code is also quite general and doesn't require you to be doing
motif searches; any sequence intervals can be displayed, in any color,
in either PNG or PDF form.  See code sample below, and attached files,
for another example.

You can grab the code for this at

	http://iorich.caltech.edu/~t/transfer/pygr-draw-dec1-2007.tar.gz

It requires (at least) motility, PIL, ReportLab's PDF library, and pygr.

Be warned: attempts to understand this code may result in head
explosion.  Some of it is very, very ugly code.

Comments & thoughts welcome, of course ;)

cheers,
--titus

This code is used to create the images attached.

--
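# Annotation comes from the pygr-draw code linked above, and sequence_name
# must already hold the name of the sequence being annotated.
# Assumption: colors is ReportLab's color module.
from reportlab.lib import colors

# Three hand-placed features; exon3 lies entirely inside exon2.
annotations1 = {}
annotations1['exon1'] = Annotation('exon1', sequence_name, 0, 50,
                                   color=colors.blue)
annotations1['exon2'] = Annotation('exon2', sequence_name, 200, 500,
                                   color=colors.green)
annotations1['exon3'] = Annotation('exon3', sequence_name, 250, 300,
                                   color=colors.black)

# 25 more features that all end at 2000 and so overlap one another,
# which forces the drawing code to stack them.
for i in range(250, 500, 10):
    name = 'exon%d' % (i + 4)
    start = i
    end = 2000
    annotations1[name] = Annotation(name, sequence_name, start, end)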
--
-------------- next part --------------
A non-text attachment was scrubbed...
Name: example.pdf
Type: application/pdf
Size: 2066 bytes
Desc: not available
Url : http://lists.idyll.org/pipermail/biology-in-python/attachments/20071201/59569fa5/attachment-0001.pdf 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: example.png
Type: image/png
Size: 11957 bytes
Desc: not available
Url : http://lists.idyll.org/pipermail/biology-in-python/attachments/20071201/59569fa5/attachment-0001.png 

