[khmer] Duration of do-partition.py (very long!) (Alexis Groppi)
Alexis Groppi
alexis.groppi at u-bordeaux2.fr
Wed Mar 20 08:43:20 PDT 2013
Hi Eric,
Actually, the previous job was killed when it hit the walltime limit.
I relaunched the script.
qstat -fr gives:
resources_used.cput = 93:23:08
resources_used.mem = 12341932kb
resources_used.vmem = 13271372kb
resources_used.walltime = 04:42:39
At this point, only the .info file has been generated.
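For reference, dividing cput by walltime approximates the average number
of cores the job keeps busy; a quick sketch using the figures above:

  cput=$((93*3600 + 23*60 + 8))        # 336188 s
  walltime=$((4*3600 + 42*60 + 39))    # 16959 s
  echo $((cput / walltime))            # 19 -> roughly 20 cores busy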
Let's wait and see ...
Thanks again
Alexis
On 19/03/2013 21:50, Eric McDonald wrote:
> Hi Alexis,
>
> What does:
> qstat -f <job-id>
> where <job-id> is the ID of your job, tell you for the following fields:
> resources_used.cput
> resources_used.vmem
>
> And how do those values compare to the actual elapsed time for the
> job, the amount of physical memory on the node, and the total
> memory (RAM + swap space) on the node?
> Just checking to make sure that everything is running as it should be
> and that your process is not swapping heavily or something like that.
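>
> A minimal way to pull out just those fields (a sketch, assuming a
> PBS/Torque-style qstat; <job-id> is a placeholder):
> qstat -f <job-id> | grep resources_used
> # and, on the compute node itself, to see RAM vs. swap actually in use:
> free -m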
>
> Thanks,
> Eric
>
>
>
> On Tue, Mar 19, 2013 at 11:23 AM, Alexis Groppi
> <alexis.groppi at u-bordeaux2.fr> wrote:
>
> Hi Adina,
>
> First of all, thanks for your answer and your advice :)
> The extract-partitions.py script works!
> As for do-partition.py on my second set, it has been running for 32
> hours.
> Shouldn't it have produced at least one temporary .pmap file by now?
>
> Thanks again
>
> Alexis
>
> On 19/03/2013 12:58, Adina Chuang Howe wrote:
>>
>>
>> Message: 1
>> Date: Tue, 19 Mar 2013 10:41:45 +0100
>> From: Alexis Groppi <alexis.groppi at u-bordeaux2.fr>
>> Subject: [khmer] Duration of do-partition.py (very long!)
>> To: khmer at lists.idyll.org
>> Message-ID: <514832D9.7090207 at u-bordeaux2.fr>
>> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>>
>> Hi Titus,
>>
>> After digital normalization and filter-below-abund, upon your
>> advice I performed do-partition.py on 2 sets of data (approx. 2.5
>> million reads (75 nt)):
>>
>> /khmer-BETA/scripts/do-partition.py -k 20 -x 1e9
>> /ag/khmer/Sample_174/174r1_prinseq_good_bFr8.fasta.keep.below.graphbase
>> /ag/khmer/Sample_174/174r1_prinseq_good_bFr8.fasta.keep.below
>> and
>> /khmer-BETA/scripts/do-partition.py -k 20 -x 1e9
>> /ag/khmer/Sample_174/174r2_prinseq_good_1lIQ.fasta.keep.below.graphbase
>> /ag/khmer/Sample_174/174r2_prinseq_good_1lIQ.fasta.keep.below
>>
>> For the first one I got
>> 174r1_prinseq_good_bFr8.fasta.keep.below.graphbase.info with the
>> information: 33 subsets total
>> Thereafter, 33 .pmap files (0.pmap through 32.pmap) were created at
>> regular intervals, and finally I got a single file,
>> 174r1_prinseq_good_bFr8.fasta.keep.below.part (all the .pmap files
>> were deleted).
>> This run took approx. 56 hours.
>>
>> For the second set (174r2), do-partition.py has been running for 32
>> hours, but so far I have only gotten
>> 174r2_prinseq_good_1lIQ.fasta.keep.below.graphbase.info with the
>> information: 35 subsets total
>> And nothing more...
>>
>> Is this duration "normal"?
>>
>>
>> Yes, this is typical. The longest I've had it run is 3 weeks, for a
>> very large dataset (billions of reads). In general, partitioning is
>> the most time-consuming of all the steps. Once it's finished, you'll
>> have much smaller files which can be assembled very quickly. Since I
>> run assembly with multiple assemblers and multiple K lengths, this
>> gain is often significant for me.
>>
>> To get the actual partitioned files, you can use the following
>> script:
>>
>> https://github.com/ged-lab/khmer/blob/master/scripts/extract-partitions.py
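>>
>> A typical invocation would look something like this (a sketch; the
>> first argument is an output prefix chosen here for illustration, and
>> options may vary by version; check extract-partitions.py --help):
>>
>> python /khmer-BETA/scripts/extract-partitions.py 174r2 \
>> 174r2_prinseq_good_1lIQ.fasta.keep.below.part
>>
>> This should write the partitions into 174r2.group*.fa files, plus a
>> 174r2.dist file with the partition size distribution.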
>>
>> (The thread parameters are at their defaults: 4 threads.)
>> 33 subsets and only one file at the end?
>> Should I stop do-partition.py on the second set and rerun it with
>> more threads?
>>
>>
>> I'd suggest letting it run.
>>
>> Best,
>> Adina
>>
>>
>> _______________________________________________
>> khmer mailing list
>> khmer at lists.idyll.org
>> http://lists.idyll.org/listinfo/khmer
>
> --
>
> _______________________________________________
> khmer mailing list
> khmer at lists.idyll.org
> http://lists.idyll.org/listinfo/khmer
>
>
>
>
> --
> Eric McDonald
> HPC/Cloud Software Engineer
> for the Institute for Cyber-Enabled Research (iCER)
> and the Laboratory for Genomics, Evolution, and Development (GED)
> Michigan State University
> P: 517-355-8733