[bip] parallel recipe and bio libraries.

Brent Pedersen bpederse at gmail.com
Sun Feb 3 18:30:07 PST 2008


On Feb 3, 2008 6:11 PM, Bruce Southey <bsouthey at gmail.com> wrote:
> Hi,
> I have not really looked at the code in depth, but BLAST uses as many
> CPUs as it is told to. It also handles multiple sequences in a single
> file, which, in theory, is meant to be more efficient. Disk I/O is also
> a limiting factor, especially with SMP (dual processors/cores), so I
> usually find there is no advantage in doing database formatting in
> parallel.
>
> So what am I missing here?
>
> Regards
> Bruce
>

good point. though i find that the -a option doesn't always work as i
think it should. but is there a built-in way to queue jobs with ncbi
blast?
even if i were to use only a single header per chromosome (no 10kmers),
that'd be 144 jobs. and i prefer to have the blast output go to
separate files.
i forgot to explain the 10kmers...
we store genomic sequence in our database as 10kmers, so i often just
use it like that, especially with poorly annotated genomes where
trusting gene models is not a good idea.
would things go faster if i blasted against a single sequence?
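
for the queueing question above, here is a minimal sketch of the idea
in python: one blast job per query file, each writing to its own output
file, with at most n_workers jobs running at once. it assumes the legacy
blastall binary is on the PATH and that each chromosome (or chunk) is
already in its own fasta file. the names make_jobs, run_jobs and
blast_command, the ".blast" suffix and the choice of blastn are made up
for illustration, not taken from the recipe.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def blast_command(query_path, db, out_path):
    # legacy blastall flags: -p program, -d database, -i query, -o output.
    # blastn is an assumption; swap in whichever program you actually use.
    return ["blastall", "-p", "blastn", "-d", db,
            "-i", query_path, "-o", out_path]


def make_jobs(query_paths, db, out_dir, make_command=blast_command):
    """Build one command per query file, each with its own output file."""
    jobs = []
    for query_path in query_paths:
        base = os.path.splitext(os.path.basename(query_path))[0]
        out_path = os.path.join(out_dir, base + ".blast")
        jobs.append(make_command(query_path, db, out_path))
    return jobs


def run_jobs(jobs, n_workers=2):
    """Run at most n_workers commands at a time.

    Threads are enough here because each one just waits on an external
    process. Exit codes come back in the same order as the jobs.
    """
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(subprocess.call, jobs))


# hypothetical usage (file and database names are placeholders):
#   jobs = make_jobs(["chr1.fasta", "chr2.fasta"], "mydb", "blast_out")
#   exit_codes = run_jobs(jobs, n_workers=4)
```

the output directory has to exist before the jobs start, and a nonzero
exit code in the returned list marks a job that failed and may need
rerunning.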

thanks,
-b


