<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Hi All,<br>
<br>
Last night we had a biology birds of a feather meeting as part of SciPy
2007. I have included notes below from both Diane and myself. To
summarize, there were two general trends during the evening:<br>
<ol>
<li>We need to establish a Python/biology community via a website, a
biology-in-python mailing list, RSS, blogs, etc.</li>
<li>Having a core set of "interfaces" for handling basic
bioinformatics objects would allow independent projects to share these
basic objects. I am sure others will describe this better and in more
detail in the near future.</li>
</ol>
I have agreed to set up the Python/biology community site. There are
some ideas in the notes below, and I will post more ideas and ask for
suggestions in a future message. <br>
<br>
Enthought has agreed to host our community site. We have the option of
using a scipy.org subdomain such as bio.scipy.org, or we can choose a
domain name like biologyinpython.org. Any thoughts or preferences on
which one we should use?<br>
<br>
I will let others jump in and provide more detail on the topics and
interests from the meeting.<br>
<br>
-Brandon King<br>
<br>
<br>
-------------------- Brandon King's Notes -----------------------------<br>
Birds of a Feather: Biology<br>
---------------------------<br>
<br>
Chris: We could use a core package that all biology Python packages can
build on while still doing their own thing. This would allow the
packages to pass data around in a compatible way.<br>
<br>
Share [complex] functionality.<br>
* graph db/pygr<br>
* common interface<br>
* sequence<br>
* sequence DB<br>
* alignment (--> annotation)<br>
* (BioPython seq_io)<br>
<br>
Parsing (only need one parser per format).<br>
<br>
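A minimal sketch of the "one parser per format" idea, assuming a shared
registry keyed by format name. The names PARSERS, register_parser, and
parse are hypothetical and not taken from BioPython or any existing
package; the FASTA parser is deliberately simplified.<br>
<pre>
```python
# Hypothetical registry: one parser per format, shared by every project.
PARSERS = {}


def register_parser(format_name):
    """Decorator that records a parser function under a format name."""
    def decorator(func):
        PARSERS[format_name] = func
        return func
    return decorator


@register_parser("fasta")
def parse_fasta(lines):
    """Yield (header, sequence) tuples from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def parse(format_name, lines):
    """Look up the single shared parser for a format and run it."""
    return PARSERS[format_name](lines)


records = list(parse("fasta", [">seq1 demo", "ACGT", "TTGA", ">seq2", "GGCC"]))
```
</pre>
With a registry like this, a BLAST or HMMER parser would be written once
and registered under its own format name, and every project would call
parse() rather than carrying its own copy.<br>
<br>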
Large analysis management / parallel / cluster processing.<br>
* map / reduce impl?<br>
l = [ x, y, ... ]<br>
mapped = map(fn, l)<br>
reduce(combine_fn, mapped)<br>
* Parallelization in Python... mailing list.<br>
<br>
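A small sketch of the map / reduce pattern noted above, assuming a toy
task (overall GC fraction across sequences). The function names gc_count
and add_counts and the sample data are made up for illustration; this is
not a cluster implementation.<br>
<pre>
```python
from functools import reduce


def gc_count(sequence):
    """Map step: return (number of G/C bases, sequence length)."""
    gc = sum(1 for base in sequence.upper() if base in "GC")
    return gc, len(sequence)


def add_counts(left, right):
    """Reduce step: combine two (gc, length) pairs into one."""
    return left[0] + right[0], left[1] + right[1]


sequences = ["ACGT", "GGCC", "ATAT"]

# Each sequence is processed independently in the map step...
mapped = map(gc_count, sequences)

# ...and the partial results are folded into one total in the reduce step.
total_gc, total_len = reduce(add_counts, mapped, (0, 0))
gc_fraction = total_gc / total_len
```
</pre>
Because the map step treats each sequence independently, the built-in
map could in principle be swapped for something like
multiprocessing.Pool.map to spread the work over several processes; how
that would scale to a cluster was left open at the meeting.<br>
<br>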
Other people's databases.<br>
<br>
<br>
Problems with BioPython:<br>
1) big, sprawling, interconnected.<br>
2) poor ... ???<br>
<br>
Parsing Issues:<br>
---------------<br>
<br>
* Blast<br>
* Hmmer<br>
<br>
Where is the community?<br>
* mailing list<br>
* wiki / website<br>
* RSS / blog / planet<br>
* extract?<br>
* use SciPy<br>
* "Don't suck." / easy_install<br>
* If you are interested in posting datasets, we can help host them.<br>
* Coding standards<br>
1) testing (<br>
2) testing buildbot<br>
3) PEP 8 compliance<br>
4)<br>
<br>
<br>
Tutorials / entry documentation:<br>
--------------------------------<br>
<br>
* Redoing analysis.<br>
* How to distribute/write/host small projects (eggs)<br>
<br>
<br>
<br>
Common Theme:<br>
-------------<br>
<br>
Core interface so programs can play well together, while the
implementations can change. This allows the interface to be independent
from the storage.<br>
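<br>
One way to picture an interface that is independent from the storage,
as a rough sketch only. The class names SequenceStore, InMemoryStore,
and FastaTextStore and the method get_sequence are hypothetical; no such
interface was agreed on at the meeting.<br>
<pre>
```python
from abc import ABC, abstractmethod


class SequenceStore(ABC):
    """Hypothetical core interface: how callers fetch a sequence by ID."""

    @abstractmethod
    def get_sequence(self, seq_id):
        """Return the sequence string for seq_id."""


class InMemoryStore(SequenceStore):
    """Storage option 1: a plain dictionary held in memory."""

    def __init__(self, records):
        self._records = dict(records)

    def get_sequence(self, seq_id):
        return self._records[seq_id]


class FastaTextStore(SequenceStore):
    """Storage option 2: FASTA text that is scanned on demand."""

    def __init__(self, fasta_text):
        self._text = fasta_text

    def get_sequence(self, seq_id):
        current, chunks = None, []
        for line in self._text.splitlines():
            line = line.strip()
            if line.startswith(">"):
                # The ID is taken to be the first word of the header line.
                fields = line[1:].split()
                current = fields[0] if fields else None
            elif current == seq_id:
                chunks.append(line)
        if not chunks:
            raise KeyError(seq_id)
        return "".join(chunks)


def reverse_complement(store, seq_id):
    """Client code depends only on the interface, not on the storage."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    sequence = store.get_sequence(seq_id)
    return "".join(pairs[base] for base in reversed(sequence))


memory = InMemoryStore({"seq1": "AACG"})
text = FastaTextStore(">seq1 demo\nAA\nCG\n")
```
</pre>
Here reverse_complement works unchanged with either store, which is the
property the notes are after: independent projects could exchange
objects through the shared interface while keeping their own storage.<br>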
<br>
<br>
Databases:<br>
* NCBI eutils, etc.<br>
* Gene ontology<br>
* mammalian PO<br>
* UCSC/Ensembl<br>
* Integr8<br>
* Textpresso?<br>
<br>
<br>
Agreements:<br>
-----------<br>
* Brandon has agreed to set up the biology-in-python community website,
etc.<br>
<br>
<br>
<br>
---------------------- Diane Trout's Notes
--------------------------------<br>
<pre wrap="">* Introductions...
* Industry, 2
* Academic, 10
* Unknown, 1
* What should we do?
* Work on a common software
* Work on a common api, or at least define a common api
* Sharing complex functionality
* Graph Database
* Sequence Database, common API
* common interface to the standard bioinformatics types
* Like sequence
* parsing (only need once per format)
* BLAST
* HMMER
*
* Biopython too monolithic
* Large Analysis Management Parallel/Cluster processing
* Map/Reduce impl
* Other people's databases
* NCBI Eutils
* Gene Ontology
* mammalian phenotype ontology
* UCSC/ENSEMBL
* integr8
* raw textpresso database available (lexicons)
* Missing Data
* Microarray Formats
* R-BioConductor
* Problems with BioPython
* Big, Sprawling, Interconnected
* Poor Automated Testing
* unpythonic
* seems low-hanging fruit
* Python software
* Where is the community
* Mailing List
* Wiki
* RSS/Planet/Blog
* bioinformatics.org
* use scipy
* Inclusivity
* how to distribute/write/share small projects
* "Don't Suck"
* Coding standards
* testing
* PEP8 compliance & docstrings
* setup.py distutils
* make sure they're easily installable
* if you want to publish your scripts & data, we will be willing
to help you host it
* Tutorials
* Entry documentation
* Good thing in BioPython
* Intro to how to use their blast parser
* Cookbook
* How to do the analysis of the paper in python
* One person argues that we shouldn't split things into too many fragments</pre>
<br>
</body>
</html>