[pygr-notify] [pygr commit] r22 - wiki

Fri May 30 17:50:32 PDT 2008

Author: cjlee112
Date: Fri May 30 17:50:28 2008
New Revision: 22

Added:
   wiki/DevelopmentRoadmap.wiki

Log:
Created wiki page through web user interface.

Added: wiki/DevelopmentRoadmap.wiki
==============================================================================

--- (empty file)
+++ wiki/DevelopmentRoadmap.wiki	Fri May 30 17:50:28 2008
@@ -0,0 +1,58 @@
+#summary Outline of plans for Pygr's future development.
+
+= Background =
+First, to put this all in context, let's review the new features added 
in the v0.7 release
+  * pygr.Data
+  * XMLRPC services (NLMSA, BlastDB)
+  * in-memory NLMSA
+  * many enhancements to NLMSA
+  * switched all alignment cases to use NLMSA interface
+  * many new features in other areas
+  * many bug fixes
+
+In short, an enormous number of different areas of new features were 
roped together under the "0.7" label.  This had a logical basis, namely 
that pygr.Data drove a wide-ranging unification of many different 
things so that they would all work seamlessly with pygr.Data.
+
+Going forward, I think it would be better for each "point release" to 
have a better defined (and smaller) scope of new capabilities.  This 
will make our development process a bit more disciplined and 
understandable to others, facilitate our testing, and make things go 
faster, I hope.  In this spirit, here are some ideas for the next point releases.
+
+= Planned Release Milestones =
+
+== v0.8: Enhancement & Clean-up Release ==
+V0.7 was very much a "developer release" in that it was a constant 
stream of new features.  I propose that v0.8 be more a "production 
release" that is less about new features and more about ensuring that 
the most used features -- alignment and annotation data -- work really 
smoothly and easily for a typical Python programmer.  I propose that 
this release would concentrate on the annotation and testing projects 
that we already have.  The goal for this release is to define several 
areas where developers other than myself can easily contribute to Pygr:
+  * add new sources of annotation data (e.g. Ensembl)
+  * add new sources of alignment data (e.g. programs like CLUSTAL etc.)
+  * add new resources to pygr.Data
+  * add web interfaces for searching / browsing contents of a 
pygr.Data server
+
+
+New features:
+  * pygr.Data auto-download capabilities (download=True mode)
+  * simple browsing / searching HTTP interface for pygr.Data server
+  * simple management functions over SSL XMLRPC
+  * XMLRPC annotation services
+  * classes supporting Ensembl annotation (in support of Jenny & Rob's project)
+  * additional UCSC alignment format(s), e.g. supporting hg18/hg17 mapping
+
+API Clean-up:
+  * define an "alignment parser API" that makes it simple to add a 
parser for an arbitrary alignment format, and have the results 
automatically loaded into an NLMSA.
+  * ensure that dict-like interfaces in Pygr properly support all the 
standard Mapping Protocol methods
+  * greatly expanded test suites courtesy of Rachel, Alex and Titus
+  * other clean-up and bug fixes
+
+== v0.9: NLMSA Joins ==
+In a 
[http://groups.google.com/group/pygr-dev/browse_thread/thread/0d4d02149022e5a6?hl=en 
number of discussions], we have already gone into some detail about new 
possible features for NLMSA.  The main idea is to define general 
purpose JOIN operations that execute in a high-speed, scalable way, 
obviating the need for researchers to write their own Python code (slow 
and possibly buggy) every time they need to find some intersection 
between two or more datasets.  The result of an NLMSA join will of 
course just be another NLMSA.  The goal is to make alignment and 
annotation query a killer app in terms of speed, query capabilities and 
ease of use.
+
+New features:
+  * ID standardization: adopting a convention for keeping IDs for 
sequence sets consistent across different NLMSAs will greatly speed up 
JOIN operations
+  * JOIN operations implemented in C
+  * more "Allen Logic" interval query operators for convenient query
+  * Python interfaces for join operations, possibly as a graph query
+
+== v1.0: pygr.Data Extensions ==
+pygr.Data is a powerful concept for sharing data, but needs additional 
tools and
+refinement.  Currently, it is not easy to manage pygr.Data resource 
databases, e.g. to copy resource or schema entries from one database to 
another, group them, delete them etc.
+
+New features:
+  * pygr.Data makerules: initially this could be implemented as simple 
construction rules a la SCons for building some type of resource from 
some types of dependencies.
+  * pygr.Data security via Signed Pickles: Python pickles create 
potential security vulnerabilities.  One simple and general solution is 
to create digitally signed pickles using OpenPGP that can be verified 
with the author's public key.  This would then integrate with standard 
tools (e.g. GnuPG) for listing sets of people whose content you trust.
+  * Pygr.Data management tools for viewing, copying, organizing, 
deleting resource and schema entries between pygr.Data resource 
databases.  These should work transparently with local resource 
databases, MySQL resource databases, and remote XMLRPC resource 
databases (all using authentication).
+  * DNS-like framework for servers to share name lookup info with each other