[pygr-notify] [pygr commit] r247 - pygr.Data -> worldbase, except for historic entries
codesite-noreply at google.com
Fri Jun 26 14:23:48 PDT 2009
Author: marecki
Date: Fri Jun 26 14:22:33 2009
New Revision: 247
Added:
wiki/ServingDataUsingWorldbase.wiki (contents, props changed)
- copied, changed from r246, /wiki/ServingDataUsingpygrData.wiki
wiki/worldbaseIntroduction.wiki (contents, props changed)
- copied, changed from r246, /wiki/pygrDataIntroduction.wiki
Removed:
wiki/ServingDataUsingpygrData.wiki
wiki/pygrDataIntroduction.wiki
Modified:
wiki/CodeExamples.wiki
wiki/DataStorageUsingpygr.wiki
wiki/MegatestSetup.wiki
wiki/NlmsaFromAxtPairwise.wiki
wiki/PygrDocumentation.wiki
wiki/PygrOnEnsembl.wiki
wiki/PygrResourceDownloader.wiki
wiki/QuickOverview.wiki
Log:
pygr.Data -> worldbase, except for historic entries
Modified: wiki/CodeExamples.wiki
==============================================================================
--- wiki/CodeExamples.wiki (original)
+++ wiki/CodeExamples.wiki Fri Jun 26 14:22:33 2009
@@ -1,4 +1,4 @@
-#summary Code snippets demonstrating how to accomplish various tasks with
Pygr.
+#summary Code snippets demonstrating how to accomplish various tasks with
Pygr.
= Introduction =
@@ -10,10 +10,10 @@
The following code examples are presently available from the Pygr Wiki:
* DataStorageUsingpygr
* GenomeCalculationsUsingpygr
+ * [NlmsaFromAxtPairwise]
* [LocatingIntergenicRegionsWithinAGenome]
- * [pygrDataIntroduction]
* PygrResourceDownloader
* SearchingforPatterns
- * ServingDataUsingpygrData
+ * ServingDataUsingWorldbase
* [SimpleAnnotationDB]
- * [NlmsaFromAxtPairwise]
\ No newline at end of file
+ * [worldbaseIntroduction]
Modified: wiki/DataStorageUsingpygr.wiki
==============================================================================
--- wiki/DataStorageUsingpygr.wiki (original)
+++ wiki/DataStorageUsingpygr.wiki Fri Jun 26 14:22:33 2009
@@ -1,8 +1,8 @@
-#summary Storing data in a MySQL table and pygr.Data
+#summary Storing data in a MySQL table and worldbase
= Introduction =
-This article is an in-depth explanation of a script in which a genome and
the accompanying annotations are manipulated in multiple ways,including in
a MySQL table, and then stored in pygr.Data, a resource database. Storing
the data this way enables it to be easily manipulated using pygr and
prevents potential errors by allowing ease of access to the necessary
genomic information.
+This article is an in-depth explanation of a script in which a genome and
the accompanying annotations are manipulated in multiple ways, including in
a MySQL table, and then stored in worldbase, a resource database. Storing
the data this way enables it to be easily manipulated using pygr and
prevents potential errors by allowing ease of access to the necessary
genomic information.
*WARNING*: This is a code-example Wiki page and as such _may_ be out of
sync with current versions of Pygr. It will be removed or refactored once
our doctest infrastructure has been deployed.
@@ -211,22 +211,22 @@
annot_map.build()
}}}
-Docstrings are then created for the genome, the annotations, and the
annotation map so they may be stored in pygr.Data. pygr.Data requires
docstrings to be assigned to every resources stored within, to allow a more
descriptive storage of resources and to allow easier access.
+Docstrings are then created for the genome, the annotations, and the
annotation map so they may be stored in worldbase. worldbase requires
docstrings to be assigned to every resource stored within, to allow a more
descriptive storage of resources and to allow easier access.
-Finally, the genome, the annotation, and the annotation map is stored in
pygr.Data. Since the annotation map is a schema, its can be stored in
pygr.Data as a schema. In order to store schema in pygr.Data, the
relationship between the schema must be defined (Many-To-Many or
One-To-Many). The annotation map is saved in pygr.Dara first, then again
with the schema assignment. When saving the map as schema, the relationship
between the schema and the resources it references must also be made clear,
and the resources must be available in pygr.Data as well (you must save the
genome and annotations along with the annotation map).
+Finally, the genome, the annotations, and the annotation map are stored in
worldbase. Since the annotation map is a schema, it can be stored in
worldbase as a schema. In order to store a schema in worldbase, the
type of relationship it describes must be defined (Many-To-Many or
One-To-Many). The annotation map is saved in worldbase first, then again
with the schema assignment. When saving the map as schema, the relationship
between the schema and the resources it references must also be made clear,
and the resources must be available in worldbase as well (you must save the
genome and annotations along with the annotation map).
-bindAttr can have up to three attribute names, although only one is used
here. 'annots' is bound to the objects of the source database (the
annotations are keys for the annotation map). The pygr.Data resources are
then stored to pygr.Data using the save() command, which is essential for
any session that modifies or adds pygr.Data resources.
+bindAttr can have up to three attribute names, although only one is used
here. 'annots' is bound to the objects of the source database (the
annotations are keys for the annotation map). The worldbase resources are
then stored to worldbase using the save() command, which is essential for
any session that modifies or adds worldbase resources.
{{{
genome.__doc__ = 'ecoli genome'
annots.__doc__ = 'ecoli annotations'
annot_map.__doc__ = 'annotation map'
-pygr.Data.Bio.Seq.Genome.ecoli = genome
-pygr.Data.Bio.Annotation.ecoli.annotations = annots
-pygr.Data.Bio.Annotation.ecoli.annotationmap = annot_map
-pygr.Data.schema.Bio.Annotation.ecoli.annotationmap = \
- pygr.Data.ManyToManyRelation(genome,annots,bindAttrs=('annots',))
+worldbase.Bio.Seq.Genome.ecoli = genome
+worldbase.Bio.Annotation.ecoli.annotations = annots
+worldbase.Bio.Annotation.ecoli.annotationmap = annot_map
+worldbase.schema.Bio.Annotation.ecoli.annotationmap = \
+ worldbase.ManyToManyRelation(genome,annots,bindAttrs=('annots',))
-pygr.Data.save()
+worldbase.save()
}}}
Modified: wiki/MegatestSetup.wiki
==============================================================================
--- wiki/MegatestSetup.wiki (original)
+++ wiki/MegatestSetup.wiki Fri Jun 26 14:22:33 2009
@@ -23,7 +23,7 @@
* [http://somethingaboutorange.com/mrl/projects/nose/ Nose] (megatests
haven't been rewritten for the new test framework yet);
- * _(optional)_ A local pygr.Data XML-RPC server, so that the
data-download test is not affected by the quality of your connection to the
UCLA one;
+ * _(optional)_ A local worldbase XML-RPC server, so that the
data-download test is not affected by the quality of your connection to the
UCLA one;
* Sequence data, miscellaneous input and reference output used by
megatests; obtaining and installing these will be described below.
Modified: wiki/NlmsaFromAxtPairwise.wiki
==============================================================================
--- wiki/NlmsaFromAxtPairwise.wiki (original)
+++ wiki/NlmsaFromAxtPairwise.wiki Fri Jun 26 14:22:33 2009
@@ -75,13 +75,13 @@
cnestedlist.NLMSA(pathstem=pathstem, mode='w', seqDict=genomeUnion,
axtFiles=axtlist, maxlen=536870912, maxint=22369620)
}}}
-If you are planning to save NLMSA into pygr.Data and never open directly
from file, you don't have to give additional options. For example:
+If you are planning to save NLMSA into worldbase and never open directly
from file, you don't have to give additional options. For example:
{{{
-import pygr.Data
+from pygr import worldbase
msa.__doc__ = "5-way alignment using axt pairwise files"
-pygr.Data.Bio.Alignment.HUMAN.hg18.hg18_pairwise5way = msa
-pygr.Data.save()
+worldbase.Bio.Alignment.HUMAN.hg18.hg18_pairwise5way = msa
+worldbase.save()
}}}
However, if you are planning to open NLMSA directly from file, the seqDict
should be saved into file by explicitly:
Modified: wiki/PygrDocumentation.wiki
==============================================================================
--- wiki/PygrDocumentation.wiki (original)
+++ wiki/PygrDocumentation.wiki Fri Jun 26 14:22:33 2009
@@ -38,7 +38,7 @@
* [http://bioinfo.mbi.ucla.edu/pygr/docs/ Pygr Versions, Publications,
Presentations]
= Talks =
- * [http://video.google.com/videoplay?docid=1813952225455171972 Pygr and
Pygr.Data talk] at UCLA Bioinformatics retreat
([http://www.doe-mbi.ucla.edu/~leec/talks/UCLA%20Bioinfo08.pdf slides in
PDF]): May 2008
+ * [http://video.google.com/videoplay?docid=1813952225455171972 Pygr and
pygr.Data talk] at UCLA Bioinformatics retreat
([http://www.doe-mbi.ucla.edu/~leec/talks/UCLA%20Bioinfo08.pdf slides in
PDF]): May 2008
* [http://bioinfo.mbi.ucla.edu/pygr/docs/SciPy07Lee.pdf SciPy 2007
presentation]: August 2007.
* [http://bioinfo.mbi.ucla.edu/pygr/docs/ISMB2006_PYGR_PPT.pdf ISMB 2006
Software Demo]: includes working examples from pygr tutorials.
* [http://bioinfo.mbi.ucla.edu/pygr/docs/pygr2005.pdf ISMB 2005
tutorial]: a quick intro to the goals of the Pygr project; this was
followed by running various code examples, more or less following the
tutorial examples in the docs.
Modified: wiki/PygrOnEnsembl.wiki
==============================================================================
--- wiki/PygrOnEnsembl.wiki (original)
+++ wiki/PygrOnEnsembl.wiki Fri Jun 26 14:22:33 2009
@@ -81,7 +81,7 @@
*-* specialized adaptor classes (subclasses of Pygr's sqlgraph.SQLTable
class): provides access to a specific sql table in an ensembl core database
-*-* private module methods: provide automatic saving of the Ensembl
database schema to pygr.Data
+*-* private module methods: provide automatic saving of the Ensembl
database schema to worldbase
*3.* the featuremapping module (featuremapping.py): provides mapping
between ensembl features
Modified: wiki/PygrResourceDownloader.wiki
==============================================================================
--- wiki/PygrResourceDownloader.wiki (original)
+++ wiki/PygrResourceDownloader.wiki Fri Jun 26 14:22:33 2009
@@ -3,42 +3,42 @@
*WARNING*: This is a code-example Wiki page and as such _may_ be out of
sync with current versions of Pygr. It will be removed or refactored once
our doctest infrastructure has been deployed.
-One can easily download pre-built pygr.Data resources into your localdisk.
Be sure to give writable path before XMLRPC server ('.' in PYGRDATAPATH).
+One can easily download pre-built worldbase resources to a local disk.
Be sure to list a writable path before the XMLRPC server ('.' in WORLDBASEPATH).
{{{
import os
- os.environ['PYGRDATAPATH'] = '.,http://biodb2.bioinformatics.ucla.edu:5000'
- import pygr.Data
+ os.environ['WORLDBASEPATH'] = '.,http://biodb2.bioinformatics.ucla.edu:5000'
+ from pygr import worldbase
- pygr.Data.dir('') # RETURNS ALL XMLRPC RESOURCES
- pygr.Data.dir('', download=True) # RETURNS ALL DOWNLOADABLE RESOURCES
+ worldbase.dir('') # RETURNS ALL XMLRPC RESOURCES
+ worldbase.dir('', download=True) # RETURNS ALL DOWNLOADABLE RESOURCES
}}}
-For seqdb.BlastDB, you have to setup PYGRDATADOWNLOAD path.
+For seqdb.BlastDB, you have to set up the WORLDBASEDOWNLOAD path.
{{{
- os.environ['PYGRDATADOWNLOAD'] = '/my/seqdb/path'
+ os.environ['WORLDBASEDOWNLOAD'] = '/my/seqdb/path'
- hg18 = pygr.Data.Bio.Seq.Genome.HUMAN.hg18(download=True)
+ hg18 = worldbase.Bio.Seq.Genome.HUMAN.hg18(download=True)
}}}
-Above line will initiate downloading and saving hg18 into your
PYGRDATADOWNLOAD path.
+The line above will download hg18 and save it into your
WORLDBASEDOWNLOAD path.
-For NLMSA, you have to setup PYGRDATABUILDDIR path..
+For NLMSA, you have to set up the WORLDBASEBUILDDIR path.
{{{
- os.environ['PYGRDATABUILDDIR'] = '/my/nlmsa/path'
+ os.environ['WORLDBASEBUILDDIR'] = '/my/nlmsa/path'
- hg18_multiz28way = pygr.Data.Bio.MSA.UCSC.hg18_multiz28way(download=True)
+ hg18_multiz28way = worldbase.Bio.MSA.UCSC.hg18_multiz28way(download=True)
}}}
-Above line will initiate downloading and saving hg18_multiz28way into your
PYGRDATABUILDDIR path.
+The line above will download hg18_multiz28way and save it into your
WORLDBASEBUILDDIR path.
If disk space is limited, remember to delete the intermediate
compressed files and text files.
If you omit the download=True option, the resource is instead accessed
remotely through the biodb2 XMLRPC server.
{{{
- hg18 = pygr.Data.Bio.Seq.Genome.HUMAN.hg18()
- hg18_multiz28way = pygr.Data.Bio.MSA.UCSC.hg18_multiz28way()
+ hg18 = worldbase.Bio.Seq.Genome.HUMAN.hg18()
+ hg18_multiz28way = worldbase.Bio.MSA.UCSC.hg18_multiz28way()
}}}
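The diff above renames three environment variables (PYGRDATAPATH, PYGRDATADOWNLOAD, PYGRDATABUILDDIR) to their WORLDBASE* equivalents. The following is only a sketch of how an existing shell setup could be carried over. `migrate_env` and `RENAMED_VARS` are hypothetical names, not part of pygr, and whether pygr itself still honours the old names is not stated in this commit. Following the wiki example, the variables would presumably need to be set before worldbase is imported.

```python
import os

# Old-to-new names, taken from the renames shown in this diff.
RENAMED_VARS = {
    "PYGRDATAPATH": "WORLDBASEPATH",
    "PYGRDATADOWNLOAD": "WORLDBASEDOWNLOAD",
    "PYGRDATABUILDDIR": "WORLDBASEBUILDDIR",
}


def migrate_env(env=None):
    """Copy legacy PYGRDATA* settings to their WORLDBASE* names.

    Hypothetical helper, not part of pygr. A value already set under
    the new name is left untouched. Returns the new names that were set.
    """
    if env is None:
        env = os.environ
    copied = []
    for old, new in RENAMED_VARS.items():
        if old in env and new not in env:
            env[new] = env[old]
            copied.append(new)
    return copied
```

Passing a plain dict instead of `os.environ` makes the helper easy to try out without touching the real environment.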
Modified: wiki/QuickOverview.wiki
==============================================================================
--- wiki/QuickOverview.wiki (original)
+++ wiki/QuickOverview.wiki Fri Jun 26 14:22:33 2009
@@ -18,7 +18,7 @@
== Optional, Recommended ==
While pygr's core functionality only requires a sane python environment,
some specific features require additional software:
- * MySQL support: allows Pygr to access MySQL databases using its
pygr.sqlgraph module. Also needed for pygr.Data module support for storage
of pygr.Data resource databases in MySQL. Requirements: *MySQL-python
(MySQLdb module) >= 1.2.0; works with any server MySQL >= 3.23.x*
+ * MySQL support: allows Pygr to access MySQL databases using its
pygr.sqlgraph module. Also needed for worldbase module support for storage
of worldbase resource databases in MySQL. Requirements: *MySQL-python
(MySQLdb module) >= 1.2.0; works with any server MySQL >= 3.23.x*
* NCBI tools: used by the pygr.seqdb.BlastDB class to provide convenient
blast/megablast search. Requirements: *formatdb, blastall, megablast*, any
recent version which you can
[http://www.ncbi.nlm.nih.gov/IEB/ToolBox/index.cgi download from NCBI];
executables must be in your $PATH.
Copied: wiki/ServingDataUsingWorldbase.wiki (from r246,
/wiki/ServingDataUsingpygrData.wiki)
==============================================================================
--- /wiki/ServingDataUsingpygrData.wiki (original)
+++ wiki/ServingDataUsingWorldbase.wiki Fri Jun 26 14:22:33 2009
@@ -1,23 +1,23 @@
-#summary Creating an XML-RPC server though pygr.Data
+#summary Creating an XML-RPC server through worldbase
= Introduction =
-Using pygr.Data to store your resources is especially convenient when
attempting to access them remotely through a server, as the unique handles
assigned to the data when registered in pygr.Data ensure ease of access.
The server used here is an XML-RPC server, a server that encodes the data
using XML (Extensible Markup Language) and then HTTP as the data transport
method. Creating an XML-RPC server is very simple, and will allow the user
to retrieve databases stored in pygr.Data, even from independent computers.
+Using worldbase to store your resources is especially convenient when
attempting to access them remotely through a server, as the unique handles
assigned to the data when registered in worldbase ensure ease of access.
The server used here is an XML-RPC server, a server that encodes the data
using XML (Extensible Markup Language) and then HTTP as the data transport
method. Creating an XML-RPC server is very simple, and will allow the user
to retrieve databases stored in worldbase, even from independent computers.
*WARNING*: This is a code-example Wiki page and as such _may_ be out of
sync with current versions of Pygr. It will be removed or refactored once
our doctest infrastructure has been deployed.
= A Helpful Example =
-First, import pygr.Data, then reference the pygr.Data resource you wish to
serve. In this case, the reference is pygr.Data.Bio.Seq.Genome.ECOLI.ecoli.
A NLMSA is a data structure used to store the genome/sequence maps. The
alignment and sequence databases stored in the NLMSA can currently be
accessed by pygr.Data.
+Import worldbase from pygr, then reference the worldbase resource you wish
to serve. In this case, the reference is
worldbase.Bio.Seq.Genome.ECOLI.ecoli. An NLMSA is a data structure used to
store the genome/sequence maps. The alignment and sequence databases stored
in the NLMSA can currently be accessed by worldbase.
-Next, the server is assigned a name; this name will be used a layer name
within pygr.Data, as well as a port number. The port number can be set to
any number that is currently available. Finally, the server can be accessed
easily by the URL from any location, as long as the URL is set to the
PYGRDATAPATH. The default PYGRDATAPATH is
http://biodb2.bioinformatics.ucla.edu:5000, and thus if this remains
unchanged, the user will not be able to add or delete resources to/from
pygr.Data. Furthermore, the server must be assigned a name (like 'rachel')
that will also be used as the layer name for the pygr.Data resource when
attempting to access it remotely.
+Next, the server is assigned a name; this name will be used as a layer name
within worldbase, as well as a port number. The port number can be set to
any number that is currently available. Finally, the server can be accessed
easily by the URL from any location, as long as the URL is set to the
WORLDBASEPATH. The default WORLDBASEPATH is
http://biodb2.bioinformatics.ucla.edu:5000, and thus if this remains
unchanged, the user will not be able to add or delete resources to/from
worldbase. Furthermore, the server must be assigned a name (like 'rachel')
that will also be used as the layer name for the worldbase resource when
attempting to access it remotely.
-In order to access the newly-created server from a remote location, the
server must be set as the PYGRDATAPATH. PYGRDATAPATH searches for pygr.Data
resources in three steps: 1) in the current directory; 2) in the home
directory; and 3) from the XMLRPC server. It is essential to assign the
server as to PYGRDATAPATH, or an error will result. The correct address to
give to PYGRDATAPATH would be the URL of your server
(http://somehost:1215), with somehost as the server address. Firewalls may
be present, and could potentially prevent access to the XML-RPC server, and
thus should be addressed as need be.
+In order to access the newly-created server from a remote location, the
server must be set as the WORLDBASEPATH. WORLDBASEPATH searches for
worldbase resources in three steps: 1) in the current directory; 2) in the
home directory; and 3) from the XMLRPC server. It is essential to assign
the server to WORLDBASEPATH, or an error will result. The correct
address to give to WORLDBASEPATH would be the URL of your server
(http://somehost:1215), with somehost as the server address. Firewalls may
be present, and could potentially prevent access to the XML-RPC server, and
thus should be addressed as need be.
{{{
-import pygr.Data
-nlmsa = pygr.Data.Bio.Seq.Genome.ECOLI.ecoli()
-server = pygr.Data.getResource.newServer('rachel', withIndex=True,
port=1215)
+from pygr import worldbase
+nlmsa = worldbase.Bio.Seq.Genome.ECOLI.ecoli()
+server = worldbase.getResource.newServer('rachel', withIndex=True,
port=1215)
server.serve_forever()
}}}
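The wiki text above describes WORLDBASEPATH as a search path mixing local directories and an XMLRPC server URL, and the PygrResourceDownloader example writes it as a comma-separated string. As a rough illustration of that layout only, the sketch below splits such a value into local and remote entries. `split_worldbase_path` is a hypothetical helper; how pygr actually parses the variable (including any entry types not shown in this diff) is an assumption here.

```python
def split_worldbase_path(value):
    """Split a comma-separated WORLDBASEPATH-style string.

    Hypothetical helper, not part of pygr. Returns (local, remote),
    where remote holds entries that look like HTTP(S) URLs and local
    holds everything else, both in their original order.
    """
    local, remote = [], []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty segments such as a trailing comma
        if entry.startswith(("http://", "https://")):
            remote.append(entry)
        else:
            local.append(entry)
    return local, remote
```

For example, `split_worldbase_path('.,http://somehost:1215')` separates the writable current directory from the server URL used in the text.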
Copied: wiki/worldbaseIntroduction.wiki (from r246,
/wiki/pygrDataIntroduction.wiki)
==============================================================================
--- /wiki/pygrDataIntroduction.wiki (original)
+++ wiki/worldbaseIntroduction.wiki Fri Jun 26 14:22:33 2009
@@ -1,8 +1,8 @@
-#summary A step-by-step example for adding data to pygr.Data
+#summary A step-by-step example for adding data to worldbase
= Introduction =
-This tutorial introduces pygr.Data, which allows for easy access to
multiple datasets by providing a consistent namespace or context for data.
This method of data retrieval enables users to manipulate large quantities
of data, potentially on multiple machines, without the added worry of
ensuring each computer can directly access the various filepaths. However,
it should be noted that pygr.Data is intended for higher-level data
resources, such as a MySQL table, BLAST sequence database, or a Python
dictionary or shelve, because pygr.Data is purposed to be a “database of
databases” rather than a substitute for a database.
+This tutorial introduces worldbase, which allows for easy access to
multiple datasets by providing a consistent namespace or context for data.
This method of data retrieval enables users to manipulate large quantities
of data, potentially on multiple machines, without the added worry of
ensuring each computer can directly access the various filepaths. However,
it should be noted that worldbase is intended for higher-level data
resources, such as a MySQL table, BLAST sequence database, or a Python
dictionary or shelve, because worldbase is purposed to be a “database of
databases” rather than a substitute for a database.
*WARNING*: This is a code-example Wiki page and as such _may_ be out of
sync with current versions of Pygr. It will be removed or refactored once
our doctest infrastructure has been deployed.
@@ -11,23 +11,21 @@
The E. coli genome sequence is stored in a BLAST database using seqdb.
BLAST (Basic Local Alignment Search Tool) databases are designed for
storing sequences so that they can be searched by alignment.
-pygr.Data is then imported to allow access to the data namespace. This is
an essential step, as pygr.Data must be previously imported in order to
store or access data from or in it. PYGRPDATAPATH must be set to the
directory in which it is located.
+worldbase is then imported to allow access to the data namespace. This is
an essential step, as worldbase must be imported before data can be
stored in it or retrieved from it. WORLDBASEPATH must be set to the
directory in which it is located.
In the following step, the data is stored in a container. There are many
options for this, including a MySQL table or a BLAST database as seen here.
-Furthermore, assigning a __doc__ string is extremely important, as the
data MUST have a __doc__ string, which describes the kind of data it is, so
that when a user looks at a directory listing of pygr.Data, he/she can
quickly ascertain what data is stored. A __doc__string (documentation
string) allows users to easily associate documentation with functions,
classes, and modules, which is especially convenient for pygr.Data, since
many databases could potentially be stored in it, and documentation ensures
clarity and unambiguity.
+Furthermore, assigning a __doc__ string is extremely important, as the
data MUST have a __doc__ string, which describes the kind of data it is, so
that when a user looks at a directory listing of worldbase, he/she can
quickly ascertain what data is stored. A __doc__ string (documentation
string) allows users to easily associate documentation with functions,
classes, and modules, which is especially convenient for worldbase, since
many databases could potentially be stored in it, and documentation ensures
clarity and unambiguity.
-Finally, the data is stored in pygr.Data using the save() function. In all
pygr.Data sessions, it is essential to call the pygr.Data.save() function
to ensure all new data that has been added that session is committed.
Furthermore, it is imperative to observe the naming conventions for saving
data to pygr.Data, since not only does it assign a unique and consistent
name to the data, ensuring its easy import, but also since multiple users
could be using one pygr.Data database and the data should be clearly
organized.
+Finally, the data is stored in worldbase using the save() function. In all
worldbase sessions, it is essential to call the worldbase.save() function
to ensure all new data that has been added that session is committed.
Furthermore, it is imperative to observe the naming conventions for saving
data to worldbase, since not only does it assign a unique and consistent
name to the data, ensuring its easy import, but also since multiple users
could be using one worldbase database and the data should be clearly
organized.
{{{
-from pygr import seqdb
-
-import pygr.Data
+from pygr import seqdb, worldbase
ecoli = seqdb.BlastDB('/home/mccreary/Projects/pygr/data/CP000802.fna')
ecoli.__doc__ = 'ecoli genome sequence'
-pygr.Data.Bio.Seq.Genome.ECOLI.ecoli = ecoli
+worldbase.Bio.Seq.Genome.ECOLI.ecoli = ecoli
-pygr.Data.save()
+worldbase.save()
}}}
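The code changes in this revision follow one mechanical pattern: `import pygr.Data` becomes `from pygr import worldbase`, and every `pygr.Data.` prefix becomes `worldbase.`. A minimal sketch of applying the same rewrite to a user script is given below. `rename_pygr_data` is a hypothetical helper, not a tool shipped with pygr. It is purely textual, so it also rewrites matches inside strings and comments and cannot tell "historic entries" apart from live code.

```python
import re


def rename_pygr_data(source):
    """Rewrite pygr.Data usages in Python source text to worldbase.

    Hypothetical helper, not part of pygr; a plain text substitution.
    """
    # The import statement changes form, so handle it before the
    # generic prefix substitution below.
    source = re.sub(
        r"^([ \t]*)import pygr\.Data[ \t]*$",
        r"\1from pygr import worldbase",
        source,
        flags=re.MULTILINE,
    )
    # Remaining references: pygr.Data.<name> -> worldbase.<name>
    return re.sub(r"\bpygr\.Data\b", "worldbase", source)
```

Reviewing the result by hand (or with a diff) is advisable, since the substitution does not understand Python syntax.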