[bip] Loading sequences from FASTA files

C. Titus Brown ctb at msu.edu
Tue Nov 24 06:38:15 PST 2009


On Tue, Nov 24, 2009 at 11:32:25AM +0000, James Casbon wrote:
> 2009/11/24 C. Titus Brown <ctb at msu.edu>:
> > Second, I don't think it's likely that a read-write relational database like
> > sqlite will, in the end, be faster than a read-only indexed flat file.  This
> > is of little concern for small data sets like 454 :).  However, I asked
> > Alex to design screed with a billion-sequence database in mind, from e.g.
> > the next generation of Illumina sequencers... and screed retrieval seems to be
> > constant with database size, so that's a good sign.  I don't have a crystal
> > ball but it seems clear that our sequencing bonanza will only continue to
> > expand and I'd like to plan ahead a year or two.
> 
> Still doing that 'low input, high throughput, no output' science ;) -
> http://www.mc.vanderbilt.edu/reporter/index.html?ID=5027

Well, if all you *need* is a good idea, what happens when you combine
that good idea with $$ of sequencing? :)

> I would hold that a corollary of Greenspun's 10th rule is that any
> indexed data format will contain an ad hoc informally-specified
> bug-ridden slow implementation of half of SQL.   You would only be
> using the read and not the write parts of SQL here.

Nice!
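
For anyone following along, here is a rough sketch of what "read-only
indexed flat file" means in practice.  This is NOT screed's actual code or
on-disk format; build_index and fetch are made-up names for illustration,
and the index lives in a plain dict rather than on disk.  The idea is just
to record the byte offset of each record once, then seek straight to it.

```python
import io


def build_index(handle):
    """Map each record name to the byte offset of its '>' header line.

    Hypothetical helper for illustration; assumes a binary file handle.
    """
    index = {}
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break
        if line.startswith(b">"):
            fields = line[1:].split()
            if fields:  # skip headers that carry no name
                index[fields[0].decode("ascii")] = offset
    return index


def fetch(handle, index, name):
    """Seek to a record by name and return its sequence as one string."""
    handle.seek(index[name])
    handle.readline()  # skip the header line
    chunks = []
    for line in iter(handle.readline, b""):
        if line.startswith(b">"):
            break  # reached the next record
        chunks.append(line.strip())
    return b"".join(chunks).decode("ascii")


# Tiny in-memory stand-in for a FASTA file on disk.
fasta = io.BytesIO(b">seq1 first\nACGT\nTTGA\n>seq2\nGGCC\n")
index = build_index(fasta)
print(fetch(fasta, index, "seq2"))
```

Building the index is one pass over the file; after that each lookup is a
dict hit plus a seek, so it does not depend on how many records precede the
one you want.  A real implementation would have to persist the index
somehow, which is where the design choices (and James's point) come in.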

--titus
-- 
C. Titus Brown, ctb at msu.edu
