[khmer] Fwd: How to speed up the filter-below-abund script ?

Wed Mar 13 14:58:45 PDT 2013

Forwarding my earlier reply to the list, since I didn't reply-to-all
earlier.

Also, Alexis, you may wish to change the following in your job script:
  #PBS -l nodes=1:ppn=1
to
  #PBS -l nodes=1:ppn=8
assuming that you have 8-core nodes available. 'filter-below-abund.py' uses
8 threads by default. If a 'khmer' job shares a node with another job, it
may try to use more CPU cores than it was allocated, which could cause
trouble with your systems administrators. And if the job's threads are
confined to the requested number of cores, performance will suffer, because
more threads (8) will be competing for fewer cores (1).
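
As a rough illustration of the mismatch described above, the sketch below
counts the cores TORQUE actually granted to a job and warns when that is
fewer than the 8 threads the script starts. It assumes that $PBS_NODEFILE
holds one line per allocated core, which is the usual TORQUE behavior. The
function name 'allocated_cores' and the variable 'THREADS' are made up for
this example and are not part of khmer.

```shell
#!/bin/sh
# Hypothetical helper: count the cores granted to this job.
# Assumption: TORQUE writes one line per allocated core to $PBS_NODEFILE.
allocated_cores() {
    if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
        # Each line is a node name, repeated once per core on that node.
        wc -l < "$PBS_NODEFILE" | tr -d ' '
    else
        # Not inside a TORQUE job (or no node file): assume a single core.
        echo 1
    fi
}

THREADS=8   # default thread count of 'filter-below-abund.py', per the note above
CORES=$(allocated_cores)
if [ "$CORES" -lt "$THREADS" ]; then
    echo "warning: $THREADS threads but only $CORES core(s) allocated" >&2
fi
```

With 'nodes=1:ppn=1' this would print the warning; with 'nodes=1:ppn=8' it
should stay silent.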

---------- Forwarded message ----------
From: Eric McDonald <emcd.msu at gmail.com>
Date: Wed, Mar 13, 2013 at 3:12 PM
Subject: Re: [khmer] How to speed up the filter-below-abund script ?
To: alexis.groppi at u-bordeaux2.fr

Alexis,

I just realized that the floating-point exception is from inside the Python
interpreter itself. If the floating-point exception had appeared from
within the 'filter-below-abund.py' script, then we should have seen a
traceback from the exception, ending with:
  ZeroDivisionError: float division by zero
Instead, we are seeing:
  line 49: 54757 Floating point exception(core dumped)
from your job shell. (I should've noticed that earlier.)
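
To illustrate the distinction, here is a small sketch of how a shell sees a
process that is killed by SIGFPE, as opposed to one that exits on its own
after printing a traceback. It does not involve khmer at all; a child shell
simply sends itself the signal. The arithmetic assumes the common shell
convention of reporting 128 plus the signal number, with SIGFPE being
signal 8 (as on Linux).

```shell
#!/bin/sh
# A child shell sends SIGFPE to itself, imitating a crash in native code.
# 'ulimit -c 0' keeps the demonstration from leaving a core file behind.
status=0
sh -c 'ulimit -c 0; kill -s FPE $$' || status=$?

# Shells commonly report death-by-signal as 128 + signal number.
if [ "$status" -gt 128 ]; then
    echo "killed by signal $((status - 128))"
else
    echo "exited normally with code $status"
fi
```

A Python-level ZeroDivisionError, by contrast, would end with a traceback
and an ordinary (small) non-zero exit code rather than one above 128.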

Would you please add the following lines to your job script somewhere
before you invoke 'filter-below-abund.py':
  python --version
  which python

And would you please add the following line _immediately after_ you invoke
'filter-below-abund.py':
  echo "Exit Code: $?"

Also, would you remove the 'time' command from in front of your invocation
of 'filter-below-abund.py'?
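
Put together, the diagnostic part of the job script might look roughly like
the sketch below. The function name 'run_and_report' is invented for this
example, and the commented-out invocation reuses the paths from your
earlier output; adjust both as needed.

```shell
#!/bin/sh
# Record which interpreter the job will use (before the real work).
python --version 2>&1 || echo "python: not found on PATH"
which python || true

# Hypothetical helper: run a command (without 'time' in front of it)
# and print its exit code immediately afterwards.
run_and_report() {
    rc=0
    "$@" || rc=$?
    echo "Exit Code: $rc"
    return "$rc"
}

# In the job script this would wrap the real invocation, e.g.:
# run_and_report ./khmer-BETA/sandbox/filter-below-abund.py \
#     /scratch/ag/khmer/174r1_table.kh \
#     /mnt/var/home/ag/174r1_prinseq_good_bFr8.fasta.keep
```

Capturing the code inside the helper means no other command can run
between the invocation and the 'echo' and overwrite '$?'.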

And, one more action before trying again... please run:
  git pull
in your 'khmer-BETA' directory. (I added another possible fix to the
'bleeding-edge' branch. This command will pull that fix into your clone.)

Thank you,
  Eric

On Wed, Mar 13, 2013 at 10:13 AM, Alexis Groppi <
alexis.groppi at u-bordeaux2.fr> wrote:

>  Hi,
>
> On 13/03/2013 14:12, Eric McDonald wrote:
>
> Hi Alexis,
>
>  First, let me say thank you for being patient and working with us in
> spite of all the problems you are encountering.
>
>
> That's bioinformatician life ;)
>
>
>
>  With regards to the floating point exception, I see several
> opportunities for a division-by-zero condition in the threading utilities
> used by the script. These opportunities exist if an input file is empty.
> (The problem may be coming from another place, but this would be my first
> guess.) What does the following command say:
>
>    ls -lh /scratch/ag/khmer/174r1_table.kh /mnt/var/home/ag/174r1_prinseq_good_bFr8.fasta.keep
>
>
>  The result (the files are not empty):
> -rw-r--r-- 1 ag users 299M 12 mars  20:54
> /mnt/var/home/ag/174r1_prinseq_good_bFr8.fasta.keep
> -rw-r--r-- 1 ag users 141G 12 mars  21:05 /scratch/ag/khmer/174r1_table.kh
>
>
>
> Also, since you appear to be using TORQUE as your resource manager/batch
> system, could you please attach the complete output and error files for the
> job? (These files should be of the form <job_name>.o2693 and
> <job_name>.e2693, where <job_name> is the name of your job. There may only
> be one or the other of these files, depending on site defaults and whether
> you specified "-j oe" or "-j eo" in your job submission.)
>
>
> I reran the job, since I had deleted the previous (2693) err/out files.
> Here is the new file (merged with the option -j oe in the bash script):
>
> #############################
> User: ag
> Date: Wed Mar 13 14:59:21 CET 2013
> Host: rainman.cbib.u-bordeaux2.fr
> Directory: /mnt/var/home/ag
> PBS_JOBID: 2695.rainman
> PBS_O_WORKDIR: /mnt/var/home/ag
> PBS_NODEFILE:  rainman
> #############################
> #############################
> Debut filter-below-abund: Wed Mar 13 14:59:21 CET 2013
>
> starting threads
> starting writer
> loading...
> ... filtering 0
> /var/lib/torque/mom_priv/jobs/2695.rainman.SC: line 49: 54757 Floating
> point exception(core dumped) ./khmer-BETA/sandbox/filter-below-abund.py
> /scratch/ag/khmer/174r1_table.kh /mnt/var/home/ag/174r1_prinseq_good_bFr8.fasta.keep
>
> real    3m54.873s
> user    0m0.085s
> sys     2m2.180s
> Date fin: Wed Mar 13 15:03:15 CET 2013
> Job finished
>
> Thanks again for your help :)
>
> Alexis
>
>
>
>  Thanks,
>   Eric
>
>
>
> On Wed, Mar 13, 2013 at 5:38 AM, Alexis Groppi <
> alexis.groppi at u-bordeaux2.fr> wrote:
>
>>  Hi Eric,
>>
>> Thanks for your answer.
>> Unfortunately, after many attempts, I am getting this error:
>>
>> starting threads
>> starting writer
>> loading...
>> ... filtering 0
>> /var/lib/torque/mom_priv/jobs/2693.rainman.SC: line 46: 63657 Floating
>> point exception(core dumped) ./khmer-BETA/sandbox/filter-below-abund.py
>> /scratch/ag/khmer/174r1_table.kh /mnt/var/home/ag/174r1_prinseq_good_bFr8.fasta.keep
>>
>> real    3m30.163s
>> user    0m0.088s
>>
>> What do you think?
>>
>> Thanks
>>
>> Alexis
>>
>>
>> On 13/03/2013 00:55, Eric McDonald wrote:
>>
>> Hi Alexis,
>>
>>  One way to get the 'bleeding-edge' branch is to clone it into a fresh
>> directory; for example:
>>    git clone http://github.com/ged-lab/khmer.git -b bleeding-edge khmer-BETA
>>
>>  Assuming you already have a clone of the 'ged-lab/khmer' repo, then you
>> should also be able to do:
>>   git fetch origin
>>   git checkout bleeding-edge
>> Depending on how old your Git client is and what its defaults are, you
>> may have to do the following instead:
>>   git checkout --track -b bleeding-edge origin/bleeding-edge
>>
>>  Hope this helps,
>>   Eric
>>
>>
>> On Tue, Mar 12, 2013 at 11:32 AM, Alexis Groppi <
>> alexis.groppi at u-bordeaux2.fr> wrote:
>>
>>>
>>> On 12/03/2013 16:16, C. Titus Brown wrote:
>>>
>>> On Tue, Mar 12, 2013 at 04:15:05PM +0100, Alexis Groppi wrote:
>>>
>>>  Hi Titus,
>>>
>>> Thanks for your answer
>>> Actually, this is my second attempt with filter-below-abund.
>>> The first time, I thought the problem came from the location of my
>>> table.kh file: a storage element with poor I/O performance.
>>> I killed the job after 24h, moved the file to a better place, and reran it,
>>> but with the same result: no completion after 24h.
>>>
>>> Any idea?
>>>
>>> Thanks
>>>
>>> Cheers From Bordeaux :)
>>>
>>> Alexis
>>>
>>> PS: The command line was the following:
>>>
>>> ./filter-below-abund.py 174r1_table.kh 174r1_prinseq_good_bFr8.fasta.keep
>>>
>>> Is this correct?
>>>
>>>  Yes, looks right... Can you try with the bleeding-edge branch, which now
>>> incorporates a potential fix for this issue?
>>>
>>>  From here: https://github.com/ged-lab/khmer/tree/bleeding-edge ?
>>> Or from
>>> here: https://github.com/ctb/khmer/tree/bleeding-edge ?
>>>
>>> Do I have to do a fresh install? If so, how?
>>> Or can I just replace all the files and folders?
>>>
>>> Thanks :)
>>>
>>> Alexis
>>>
>>>
>>>  thanks,
>>> --titus
>>>
>>>
>>>  On 12/03/2013 14:41, C. Titus Brown wrote:
>>>
>>>  On Tue, Mar 12, 2013 at 10:48:03AM +0100, Alexis Groppi wrote:
>>>
>>>  Metagenome assembly :
>>> My data :
>>> - original (quality filtered) data : 4463243 reads (75 nt) (Illumina)
>>> 1/ Single pass digital normalization with normalize-by-median (C=20)
>>> ==> file .keep of 2560557 reads
>>> 2/ Generated a hash table with load-into-counting from the .keep file
>>> ==> .kh file of ~16 GB (huge file?!)
>>> 3/ filter-below-abund with C=100 on the two previous files (table.kh
>>> and reads.keep)
>>> Still running after 24 hours :(
>>>
>>> Any advice on speeding up this step, and the others (partitioning, ...)?
>>>
>>> I can get access to an HPC system: ~3000 cores.
>>>
>>>  Hi Alexis,
>>>
>>> filter-below-abund and filter-abund have occasional bugs that prevent them
>>> from completing.  I would kill and restart.  For that few reads it should
>>> take no more than a few hours to do everything.
>>>
>>> Note that most of what khmer does cannot easily be distributed across
>>> multiple chassis.
>>>
>>> best,
>>> --titus
>>>
>>> _______________________________________________
>>> khmer mailing list
>>> khmer at lists.idyll.org
>>> http://lists.idyll.org/listinfo/khmer
>>>
>>>
>>
>>
>>  --
>>  Eric McDonald
>> HPC/Cloud Software Engineer
>>   for the Institute for Cyber-Enabled Research (iCER)
>>   and the Laboratory for Genomics, Evolution, and Development (GED)
>> Michigan State University
>> P: 517-355-8733
>>
>>
>
>
>
>  --
>  Eric McDonald
> HPC/Cloud Software Engineer
>   for the Institute for Cyber-Enabled Research (iCER)
>   and the Laboratory for Genomics, Evolution, and Development (GED)
> Michigan State University
> P: 517-355-8733
>
>

-- 
Eric McDonald
HPC/Cloud Software Engineer
  for the Institute for Cyber-Enabled Research (iCER)
  and the Laboratory for Genomics, Evolution, and Development (GED)
Michigan State University
P: 517-355-8733
