[khmer] exceeding defined RAM limits?
Oh, Julia (NIH/NHGRI) [F]
julia.oh at nih.gov
Tue Dec 24 14:59:09 PST 2013
Results are in, and the error reproduced.
I ran the following commands:
python2.7 /home/ohjs/khmer/scripts/normalize-by-median.py -C 20 -k 20 -N 4 -x 60e9 --savehash round2.unaligned_ref.kh -R round2.unaligned_1.report round2.unaligned;
python2.7 /home/ohjs/khmer/scripts/filter-abund.py round2.unaligned_ref.kh round2.unaligned.keep;
python2.7 /home/ohjs/khmer/scripts/normalize-by-median.py -C 5 -k 20 -N 4 -x 16e9 round2.unaligned.keep.abundfilt;
This last command yields:
########
... kept 116741181 of 151000000 or 77%
... in file round2.unaligned.keep.abundfilt
... kept 116816167 of 151100000 or 77%
... in file round2.unaligned.keep.abundf
-------- running PBS epilogue script (5081978.biobos p78 ohjs) --------
Show some job stats:
5081978.biobos elapsed time: 9485 seconds
5081978.biobos walltime: 02:37:52 hh:mm:ss
5081978.biobos memory limit: 69.14 GB
5081978.biobos memory used: 69.16 GB
5081978.biobos cpupercent used: 98.00 %
==================================================================================================
|| NOTE: this job was likely deleted by the batch system due to exceeding available memory. ||
==================================================================================================
#########
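
For comparison with the memory figures above, here is a rough sketch of the table memory implied by the -N and -x options. It assumes (this is not stated anywhere in this thread) that each table entry is a one-byte counter, so that N tables of x entries need about N * x bytes. The function names are made up for illustration and are not part of khmer.

```python
# Back-of-the-envelope estimate of counting-table memory from -N and -x.
# ASSUMPTION: one byte per table entry, so memory is roughly N * x bytes.
# These helper names are hypothetical; they are not khmer functions.


def estimated_table_bytes(n_tables, table_size):
    """Approximate bytes needed for n_tables tables of table_size entries."""
    return int(n_tables * table_size)


def to_gb(n_bytes):
    """Convert bytes to decimal gigabytes (10**9 bytes)."""
    return n_bytes / 1e9


def to_gib(n_bytes):
    """Convert bytes to binary gibibytes (2**30 bytes)."""
    return n_bytes / float(2 ** 30)


if __name__ == "__main__":
    # The two -N / -x combinations used in the commands above.
    for label, n_tables, table_size in [
        ("-N 4 -x 16e9", 4, 16e9),
        ("-N 4 -x 60e9", 4, 60e9),
    ]:
        n_bytes = estimated_table_bytes(n_tables, table_size)
        print("%s: about %.1f GB (%.1f GiB)"
              % (label, to_gb(n_bytes), to_gib(n_bytes)))
```

Under that assumption, -N 4 -x 16e9 comes to about 64 GB for the tables alone, which matches the 64 GB expectation mentioned earlier in the thread and leaves only a few GB of headroom under a 69.14 GB limit. The estimate covers only the tables, not whatever else the script holds in memory.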
Thanks & happy holidays,
Julia
On Dec 18, 2013, at 10:46 AM, C. Titus Brown <ctb at msu.edu> wrote:
> On Wed, Dec 18, 2013 at 03:43:22PM +0000, Oh, Julia (NIH/NHGRI) [F] wrote:
>> [ohjs at helix khmer]$ git checkout master
>> Branch master set up to track remote branch master from origin.
>> Switched to a new branch 'master'
>> [ohjs at helix khmer]$ make
>>
>> ===> lots of stuff, ending with:
>>
>> copying build/lib.linux-x86_64-2.6/khmer/_khmermodule.so -> khmer
>> make[1]: Leaving directory `/home/ohjs/khmer/python'
>>
>> [ohjs at helix khmer]$ git branch
>> bleeding-edge
>> * master
>
> OK, great! This is the latest development version; can you see if you can
> reproduce the problem with it? (Sadly, I expect you will, as we haven't
> made many significant changes to normalize-by-median's machinery...)
>
> best,
> --titus
>
>> On Dec 18, 2013, at 8:10 AM, C. Titus Brown <ctb at msu.edu> wrote:
>>
>>> On Wed, Dec 18, 2013 at 03:07:57AM +0000, Oh, Julia (NIH/NHGRI) [F] wrote:
>>>> Titus, thanks for the tip on variable coverage; will definitely try that out.
>>>
>>> Great -- should significantly improve sensitivity to low coverage "stuff"!
>>>
>>>> Michael, pretty sure I did a git clone. The last date in my directory is Sept 5th, but I'm not sure whether that would be the pull date or your last-modified date.
>>>
>>> OK, and then one last check... did you check out the 'master' or 'legacy'
>>> branch? What does 'git branch' report?
>>>
>>> To check out master, do:
>>>
>>> git checkout master
>>> make
>>>
>>> cheers,
>>> --titus
>>>
>>>> On Dec 17, 2013, at 8:16 PM, Michael R. Crusoe <mcrusoe at msu.edu> wrote:
>>>>
>>>> Hello Julia,
>>>>
>>>> What version of khmer are you using?
>>>>
>>>> That is, did you install via `pip` or a `git clone`?
>>>>
>>>>
>>>> On Tue, Dec 17, 2013 at 5:14 PM, C. Titus Brown <ctb at msu.edu> wrote:
>>>> On Tue, Dec 17, 2013 at 04:36:34PM -0800, C. Titus Brown wrote:
>>>>> On Tue, Dec 17, 2013 at 07:53:18PM +0000, Oh, Julia (NIH/NHGRI) [F] wrote:
>>>>> Now, on to your real question :)
>>>>>
>>>>>> $python2.7 /home/ohjs/khmer/scripts/normalize-by-median.py -C 5 -k 20 -N 4 -x 16e9 round2.unaligned.keep.abundfilt;
>>>>>>
>>>>>> I thought I would be maxing out at 64 GB of RAM for the hash table (I've also used 32e9), but I get the following RAM usage report:
>>>>>>
>>>>>> 4986693.biobos elapsed time: 23358 seconds
>>>>>> 4986693.biobos walltime: 06:28:36 hh:mm:ss
>>>>>> 4986693.biobos memory limit: 249.00 GB
>>>>>> 4986693.biobos memory used: 249.76 GB
>>>>>> 4986693.biobos cpupercent used: 98.00 %
>>>>>
>>>>> What the heck!? That's not supposed to happen!
>>>>>
>>>>> This is either a bug or (most likely) caused by an overabundance of
>>>>> high-abundance k-mers. The latter is easy to fix -- I've filed a bug report to
>>>>> fix it in the software overall [0] -- but for now it would require you to
>>>>> modify the script. If you're up for that, put
>>>>>
>>>>> ht.set_use_bigcount(False)
>>>>>
>>>>> at line 186 of normalize-by-median:
>>>>
>>>> Darn it, that can't be the problem; I just wrote a test against this
>>>> behavior and we actually did things right in the script and ignored
>>>> high abundance k-mers.
>>>>
>>>> So, this must be a bug of some sort. Umm... Michael, any ideas?!
>>>>
>>>> cheers,
>>>> --titus
>>>> --
>>>> C. Titus Brown, ctb at msu.edu
>>>>
>>>> _______________________________________________
>>>> khmer mailing list
>>>> khmer at lists.idyll.org
>>>> http://lists.idyll.org/listinfo/khmer
>>>>
>>>>
>>>>
>>>> --
>>>> Michael R. Crusoe: Software Engineer and Bioinformatician mcrusoe at msu.edu
>>>> @ the Genomics, Evolution, and Development lab; Michigan State University
>>>> http://ged.msu.edu/ http://orcid.org/0000-0002-2961-9670 @biocrusoe<http://twitter.com/biocrusoe>
>>>>
>>>
>>> --
>>> C. Titus Brown, ctb at msu.edu
>>
>
> --
> C. Titus Brown, ctb at msu.edu