[pygr-notify] Issue 42 in pygr: KeyError for BLAST results that return no hits

codesite-noreply at google.com
Thu Sep 25 13:46:58 PDT 2008


Issue 42: KeyError for BLAST results that return no hits
http://code.google.com/p/pygr/issues/detail?id=42

New issue report by kmdaily:
What steps will reproduce the problem?
import pygr.Data
from pygr.sequence import Sequence

g = pygr.Data.getResource("Bio.Seq.Genome.YEAST.sacCer")
s = Sequence("TCTTCCTCACTCTCAGGGT", "test")
r = g.blast(s, maxseq=1)
r[s].edges()

What is the expected output? What do you see instead?
I would expect the output to be an empty list.

I get:
---------------------------------------------------------------------------
<type 'exceptions.KeyError'>              Traceback (most recent call last)

/home/baldig/projects/genomics/svn/data/yeast/2008_9_16_12_23_28_1_260_1/yeast/locations/<ipython console>
in <module>()

/home/baldig/projects/genomics/svn/data/yeast/2008_9_16_12_23_28_1_260_1/yeast/locations/pygr.cnestedlist.pyx
in pygr.cnestedlist.NLMSA.__getitem__()

/home/dock/shared_libraries/lx64/pkgs/python/2.5.1/lib/python2.5/site-packages/pygr/nlmsa_utils.py
in __getitem__(self, seq)

/home/dock/shared_libraries/lx64/pkgs/python/2.5.1/lib/python2.5/site-packages/pygr/nlmsa_utils.py
in getSeqID(self, seq)

/home/dock/shared_libraries/lx64/pythonpkgs/2.5.1/pygr_0_7_1/pygr/seqdb.py
in __getitem__(self, seq)
    1456         'handles optional mode that adds seq if not already present'
    1457         try:
-> 1458             return self.getName(seq)
    1459         except KeyError:
    1460             if self.db.addAll:

/home/dock/shared_libraries/lx64/pythonpkgs/2.5.1/pygr_0_7_1/pygr/seqdb.py
in getName(self, seq)
    1450         except AttributeError: # NO db?  THEN TREAT AS A user SEQUENCE
    1451             userID='user'+self.db.separator+seq.pathForward.id
-> 1452             s=self.db[userID] # MAKE SURE ALREADY IN user SEQ DICTIONARY
    1453             return userID # ALREADY THERE
    1454

/home/dock/shared_libraries/lx64/pythonpkgs/2.5.1/pygr_0_7_1/pygr/seqdb.py
in __getitem__(self, k)
    1356             prefix = t[0] # ASSUME PREFIX DOESN'T CONTAIN separator
    1357             id = k[len(prefix)+1:] # SKIP PAST PREFIX
-> 1358         d=self.prefixDict[prefix]
    1359         try: # TRY TO USE int KEY FIRST
    1360             return d[int(id)]

<type 'exceptions.KeyError'>: 'user'
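
To illustrate why the error message is just 'user', here is a minimal stand-in for the prefix lookup shown in the last traceback frame. It is not the pygr code: the function name lookup, the "." separator, and the contents of prefix_dict are assumptions made for the example, and the int-key attempt from the real code is left out.

```python
def lookup(prefix_dict, key, separator="."):
    """Split 'prefix<separator>id' and look the id up under its prefix."""
    prefix = key.split(separator)[0]   # assume the prefix has no separator
    seq_id = key[len(prefix) + 1:]     # skip past prefix and separator
    d = prefix_dict[prefix]            # KeyError(prefix) if prefix is unknown
    return d[seq_id]


# Hypothetical database with one registered prefix and no 'user' entry.
prefix_dict = {"sacCer": {"chr7": "ACGT"}}

try:
    lookup(prefix_dict, "user" + "." + "test")
except KeyError as err:
    print("KeyError:", err)
```

In this sketch the exception carries only the missing prefix, which matches the 'user' seen above and says nothing about the query having no hits.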



What version of the product are you using? On what operating system?
0.7.1

Please provide any additional information below.
The output of running blastall from the command line is below. g.blast
returns no results because the e-value (0.91) is above the threshold, but
the resulting KeyError: 'user' is confusing and does not indicate that
there were no hits. Returning an empty list would be more useful when
there are no results.

$ cat test
>test
TCTTCCTCACTCTCAGGGT
$ blastall -i test -p blastn -d ~baldig/projects/genomics/genomes/sacCer.fa -v 1 -b 1
BLASTN 2.2.18 [Mar-02-2008]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Query= test
          (19 letters)

Database: /home/baldig/projects/genomics/genomes/sacCer.fa
            17 sequences; 12,156,677 total letters

Searching..................................................done



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

chr7                                                                   26   0.91

>chr7
           Length = 1090947

  Score = 26.3 bits (13), Expect = 0.91
  Identities = 13/13 (100%)
  Strand = Plus / Minus


Query: 2      cttcctcactctc 14
               |||||||||||||
Sbjct: 102012 cttcctcactctc 102000


   Database: /home/baldig/projects/genomics/genomes/sacCer.fa
     Posted date:  Sep 12, 2008  3:19 PM
   Number of letters in database: 12,156,677
   Number of sequences in database:  17

Lambda     K      H
     1.37    0.711     1.31

Gapped
Lambda     K      H
     1.37    0.711     1.31


Matrix: blastn matrix:1 -3
Gap Penalties: Existence: 5, Extension: 2
Number of Sequences: 17
Number of Hits to DB: 1171
Number of extensions: 50
Number of successful extensions: 13
Number of sequences better than 10.0: 7
Number of HSP's gapped: 13
Number of HSP's successfully gapped: 13
Length of query: 19
Length of database: 12,156,677
Length adjustment: 13
Effective length of query: 6
Effective length of database: 12,156,456
Effective search space: 72938736
Effective search space used: 72938736
X1: 11 (21.8 bits)
X2: 15 (29.7 bits)
X3: 50 (99.1 bits)
S1: 12 (24.3 bits)
S2: 12 (24.3 bits)
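
Until the behaviour changes, the empty-list result asked for above could be approximated on the caller's side. The following is only a sketch: edges_or_empty and FakeSlice are hypothetical names, the plain dicts stand in for the object returned by blast(), and it assumes that a KeyError on lookup means the query had no hits.

```python
def edges_or_empty(alignment, seq):
    """Return the edges for seq as a list, or [] if seq is not in alignment."""
    try:
        return list(alignment[seq].edges())
    except KeyError:
        # Assumption: the query is missing because it produced no hits.
        return []


class FakeSlice(object):
    """Stand-in for an alignment slice; not a pygr class."""

    def edges(self):
        return [("src", "dest", "edge")]


no_hits = {}                       # stand-in for a result with no hits
one_hit = {"test": FakeSlice()}    # stand-in for a result with one hit

print(edges_or_empty(no_hits, "test"))
print(edges_or_empty(one_hit, "test"))
```

Note that catching KeyError this broadly would also hide a genuine lookup error, so a fix inside the library that distinguishes "no hits" from "unknown sequence" would still be preferable.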




Issue attributes:
	Status: New
	Owner: ----
	Labels: Type-Defect Priority-Medium
