[pygr-notify] Issue 49 in pygr: Initial delay for membership checking in seqdb.BlastDB

Thu Dec 11 09:34:21 PST 2008

Updates:
	Status: Accepted
	Cc: cjlee112

Comment #2 on issue 49 by cjlee112: Initial delay for membership checking in seqdb.BlastDB
http://code.google.com/p/pygr/issues/detail?id=49

OK, I have a better idea.  We can simply restrict this reindexing behavior
to the specific operation of looking up IDs during a BLAST search.  We only
implemented this behavior to deal with BLAST's buggy mangling of sequence
IDs, so there's no need to apply it in other situations.  If it isn't
applied at any other time, looking up an ID that isn't in the database will
simply fail (KeyError), with no delay.
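
To make the proposal concrete, here is a minimal sketch of the idea, not
pygr's actual code.  The class ToySeqDB, the method lookup_blast_id, and
the rule that a mangled ID is the last '|'-separated field are all
assumptions made up for illustration; the real mangling rule is whatever
blastall actually does.

```python
class ToySeqDB(object):
    """Toy stand-in for a sequence database (hypothetical, not seqdb.BlastDB)."""

    def __init__(self, seqs):
        self._seqs = dict(seqs)
        self._mangled_index = None  # built only when a BLAST lookup needs it
        self.reindex_count = 0      # counts how often the slow path ran

    def __getitem__(self, seq_id):
        # Ordinary lookup: never reindexes, so a missing ID fails at once.
        return self._seqs[seq_id]

    def __contains__(self, seq_id):
        # Membership checking also stays on the fast path.
        return seq_id in self._seqs

    def _build_mangled_index(self):
        # ASSUMPTION: the mangled form is the last '|'-separated field.
        self.reindex_count += 1
        self._mangled_index = {}
        for seq_id in self._seqs:
            self._mangled_index[seq_id.split('|')[-1]] = seq_id

    def lookup_blast_id(self, blast_id):
        """Resolve an ID reported by BLAST; only this path may reindex."""
        try:
            return self._seqs[blast_id]
        except KeyError:
            pass
        if self._mangled_index is None:
            self._build_mangled_index()
        # Raises KeyError if the ID is unknown even after reindexing.
        return self._seqs[self._mangled_index[blast_id]]
```

With this split, "x in db" and db[x] never pay the reindexing cost; only
an ID coming back from a BLAST search can trigger it, and only once.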

Questions:
- Should we do the initial reindexing at the same time as the formatdb
step?  This might reduce user annoyance, since users expect formatdb to
take some time to reindex the database.

- Should we print out a warning message explaining that we're reindexing
the BLAST database?  This might also reduce user annoyance / confusion, by
clearing up the mystery of "why is Pygr so slow?".

- Should we allow the user to turn off reindexing (which means that BLAST
will not work on NCBI databases with "mangled blob" IDs)?

- Can we auto-detect whether reindexing is needed (i.e. detect whether the
sequence IDs are blobs that blastall will mangle)?  Then we could dispense
with it completely on non-NCBI databases (or more specifically, databases
whose IDs blastall won't mangle).
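
The last three questions could fit together roughly as sketched below.
This is only an illustration under stated assumptions: the function
prepare_blast_index and its parameters allow_reindex, looks_mangled and
build_index are hypothetical names, not part of pygr, and the test for
which IDs blastall mangles is left to a caller-supplied predicate because
the exact rule is still an open question here.

```python
import warnings


def prepare_blast_index(seq_ids, build_index, allow_reindex=True,
                        looks_mangled=None):
    """Return an ID index for BLAST lookups, or None if reindexing is skipped.

    All names here are hypothetical; none of them is part of pygr.
    """
    seq_ids = list(seq_ids)
    if not allow_reindex:
        # User opted out: BLAST hits with mangled IDs will not resolve.
        return None
    if looks_mangled is not None and not any(looks_mangled(s) for s in seq_ids):
        # Auto-detection found no ID that would be mangled: skip the work.
        return None
    # Tell the user why things are slow before doing the expensive step.
    warnings.warn("Reindexing BLAST database IDs; this may take a while.")
    return build_index(seq_ids)
```

A caller would pass its own detection rule and index builder, for example
looks_mangled=lambda s: '|' in s as a placeholder guess.  Passing
looks_mangled=None keeps the "always reindex" behavior.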
