<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Yes, the steady ballooning is quite obvious, especially if I spend
some time watching the output of top. Thank you for your time; I
hope that someone will look into this. As a side note, might my
graafik.ht be corrupted somehow? It is even smaller than the
50m.ht, which I was able to partition without trouble. As
additional information for anyone interested: the data used was
~36M 250 bp reads. <br>
Jens-Konrad<br>
<div class="moz-cite-prefix">On 04/13/2013 05:35 AM, Eric McDonald
wrote:<br>
</div>
<blockquote
cite="mid:CAGhFaV38cmW-mLXOQE5uffopXbvmK-=w8tauntMghLppO=1TXQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div style="">Jens-Konrad,</div>
<div><br>
</div>
Thanks for providing this information.
<div> 15: <span style="color:rgb(0,0,0);white-space:pre-wrap">resources_used.mem
= 52379536kb</span></div>
<div><font color="#000000"><span style="white-space:pre-wrap">
30: </span></font><span
style="color:rgb(0,0,0);white-space:pre-wrap">resources_used.mem
= 90676068kb</span></div>
<div><font color="#000000"><span style="white-space:pre-wrap">
45: </span></font><span
style="color:rgb(0,0,0);white-space:pre-wrap">resources_used.mem
= 122543188kb</span></div>
<div><font color="#000000"><span style="white-space:pre-wrap">Definitely
some ballooning memory use there.<br>
</span></font>
<div><br>
</div>
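Assuming those three snapshots are evenly spaced (an assumption based on their 15/30/45 labels), a quick awk one-liner puts the growth at roughly 30-37 GB per interval:

```shell
# Growth between the three resources_used.mem samples quoted above
# (values in kb; GB figures assume 1 GB = 1024*1024 kb).
echo "52379536 90676068 122543188" | awk '{
    printf "interval 1: +%.1f GB\n", ($2 - $1) / 1024 / 1024;
    printf "interval 2: +%.1f GB\n", ($3 - $2) / 1024 / 1024;
}'
# prints: interval 1: +36.5 GB
#         interval 2: +30.4 GB
```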
<div style="">One more thing you may wish to examine from the
command line is:</div>
<div style=""> qmgr -c "l s" | grep 'resources_'</div>
<div style="">This will tell you about any default resources
(such as physical memory) that your PBS server is assigning
to new jobs. That said, I do believe that your jobs are
exhausting available memory.</div>
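If it helps, here is a sketch of how to filter such output; the qmgr listing below is invented for illustration (your server's resource names and values will differ):

```shell
# Hypothetical 'qmgr -c "l s"' output saved to a file; the resource
# attribute names are standard PBS/TORQUE, but the values are made up.
cat > qmgr_output.txt <<'EOF'
Server myserver
        resources_default.mem = 4gb
        resources_default.walltime = 01:00:00
        resources_max.mem = 256gb
EOF

# Show any default or maximum resources the server assigns to jobs.
grep 'resources_' qmgr_output.txt
```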
</div>
<div style="">So, now the question is whether anything can be
done about it. Unless someone with more experience with the
partitioning code decides to speak up, I am going to have to
analyze your chosen parameters and the pieces of code in
question to see if I can deduce anything. I might not be able
to do this until Monday - I am too tired to do it tonight
(here in US Eastern time) and have a busy weekend ahead of
me. </div>
<div style=""><br>
</div>
<div style="">I promise I will get back to you with some better
answers if no one else decides to say anything. While you are
waiting, if you want to test your hypothesis that the number
of threads correlates with increased memory use, I would
recommend using a smaller data set and seeing how memory use
scales as you change the number of threads.</div>
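Such an experiment might be scripted like the sketch below; the graph basename and subset size are placeholders, and the loop only echoes the commands (drop the echo to actually run them under /usr/bin/time -v and compare the reported maximum resident set size):

```shell
# Sketch of a thread-scaling experiment on a small data set.
# 'small' is a placeholder graph basename; adjust to your files.
# Remove the 'echo' to really run each command under /usr/bin/time -v,
# which reports "Maximum resident set size" on Linux.
for threads in 1 2 4 8; do
    echo /usr/bin/time -v ./khmer/scripts/partition-graph.py \
        --threads "$threads" --subset-size 1e5 small
done
```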
<div style=""><br>
</div>
<div style="">Have a good weekend,</div>
<div style=""> Eric</div>
<div style=""><br>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Fri, Apr 12, 2013 at 7:30 AM,
Jens-Konrad Preem <span dir="ltr"><<a
moz-do-not-send="true" href="mailto:jpreem@ut.ee"
target="_blank">jpreem@ut.ee</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div class="im">
<div>On 04/11/2013 02:58 AM, Eric McDonald wrote:<br>
</div>
</div>
<div>
<div class="h5">
<blockquote type="cite">
<div dir="ltr">Forgot to reply to all, in case the
answer will help anyone else on the list....<br>
<br>
<div class="gmail_quote">---------- Forwarded
message ----------<br>
From: <b class="gmail_sendername">Eric McDonald</b>
<span dir="ltr"><<a moz-do-not-send="true"
href="mailto:emcd.msu@gmail.com"
target="_blank">emcd.msu@gmail.com</a>></span><br>
Date: Wed, Apr 10, 2013 at 7:57 PM<br>
Subject: Re: [khmer] parition-graph memory
requirements<br>
To: Jens-Konrad Preem <<a
moz-do-not-send="true"
href="mailto:jpreem@ut.ee" target="_blank">jpreem@ut.ee</a>><br>
<br>
<br>
<div dir="ltr">Hi,
<div><br>
</div>
<div> Sorry for the delayed reply.</div>
<div><br>
</div>
<div>Thanks for sharing your job scripts. I
notice that you are specifying the 'vmem'
resource. However, if PBS is also enforcing
a limit on the 'mem' resource (physical
memory), then you may be encountering that
limit. Do you know what default value is
assigned by your site's PBS server for the
'mem' resource?</div>
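For instance, a job script might request both limits explicitly; the values below are placeholders, so check your site's qsub documentation for the exact resource names it honors:

```shell
#PBS -l nodes=1:ppn=4
#PBS -l mem=64gb       # physical memory limit (resident set)
#PBS -l vmem=64gb      # virtual memory limit
#PBS -l walltime=12:00:00
```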
<div><br>
</div>
<div>Again, if you run:</div>
<div> qstat -f <job_id></div>
<div>you should be able to determine both the
resources allocated for the job and how much
the job is actually using. Please let us
know the results of this command, if you
would like help interpreting them and
figuring out how to change your PBS resource
request, if necessary.</div>
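For illustration, here is how you might pull just the memory lines out of a saved snapshot; the job values below are invented:

```shell
# Hypothetical 'qstat -f <job_id>' snapshot; real output has many
# more fields. The values here are made up.
cat > qstat_output.txt <<'EOF'
Job Id: 12345.myserver
    Resource_List.mem = 240gb
    Resource_List.vmem = 240gb
    resources_used.mem = 52379536kb
    resources_used.vmem = 60123456kb
EOF

# Compare what was requested (Resource_List) with what the job is
# actually consuming (resources_used).
grep -E 'Resource_List\.v?mem|resources_used\.v?mem' qstat_output.txt
```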
<div><br>
</div>
<div>As a side note, smaller k-mer lengths
mean that more k-mers are being extracted
from each sequence. This means that the hash
tables are being more densely populated.
And, that means that you are more likely to
need larger hash tables to avoid a
significant false positive rate. But perhaps
a better way to put it is that the amount of
memory used by the hash tables is independent
of k-mer size, since you choose the table
sizes explicitly; changing the k-mer length
only changes how densely the tables are
populated. So, changing k-mer length does not
affect memory usage for many parts of khmer.
(I would have to look more closely to see how
this affects the partitioning code.)</div>
<div><br>
</div>
<div>Hope that helps,</div>
<div> Eric</div>
<div><br>
</div>
</div>
<div>
<div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Wed, Apr 10,
2013 at 4:23 AM, Jens-Konrad Preem <span
dir="ltr"><<a
moz-do-not-send="true"
href="mailto:jpreem@ut.ee"
target="_blank">jpreem@ut.ee</a>></span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
Hi,<br>
<br>
In an extreme act of foolishness I
seem to have lost my error logs. (I
have been messing with the various
scripts here a lot and got rid of
some of the outputs in an
ill-thought-out "housekeeping"
sweep.)<br>
<br>
I attach a bunch of PBS scripts
that I used to get as far as I am.
I used a separate script for each
stage of the normalize-and-partition
pipeline, so I'd have time to look
at the outputs and get a sense of
the time taken by each. The
scripts are, in order:
supkhme (normalize),
suprem (filter-below),
supload (load-graph), and finally
supart (partition-graph). (As can
be seen, I am doing the metagenome
analysis as per the guide.txt.)<br>
All the previous scripts completed
without complaint, producing the
5.2 GB "graafik" graph.<br>
<br>
The partition-graph step had failed
a few times after running for an
hour or so, always with error
messages concerning memory. The
latest script requests 240 GB of
memory, which is the maximum I can
request in the near future, and it
still failed with an error message
concerning memory.<br>
<br>
I am working on reproducing the
error right now, so I can then
supply you with the .log and
.error files (if no error occurs,
all the better for me, of
course).<br>
I decided to try different k-values
this time, as suggested by <a
moz-do-not-send="true"
href="https://khmer.readthedocs.org/en/latest/guide.html"
target="_blank">https://khmer.readthedocs.org/en/latest/guide.html</a>
(20 for normalization and 32 for
partitioning); those should make
the graph file even bigger. I used
the smaller values to avoid running
out of memory, but since that
doesn't seem to help, what the
heck. ;D
Right now I am at the load-graph
stage with the new set. As it will
complete in a few hours, I'll
start the partition-graph run, and
then we will see if it dies within
an hour. If so, I'll post a new
set of scripts and logs.<br>
<br>
Thank you for your time,<br>
Jens-Konrad
<div>
<div><br>
<br>
<br>
<br>
<div>On 04/10/2013 04:18 AM,
Eric McDonald wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Hi Jens-Konrad,
<div><br>
</div>
<div>Sorry for the delayed
response. (I was on
vacation yesterday and
hoping that someone more
familiar with the
partitioning code would
answer.)</div>
<div><br>
</div>
<div>My understanding of the
code is that decreasing
the subset size will
increase the number of
partitions but will not
change the overall graph
coverage. Therefore, I
would not expect it to
lower memory requirements.
(The overhead from
additional partitions
might raise them some, but
I have not analyzed the
code deeply enough to say
one way or another about
that.) As far as changing
the number of threads
goes, each thread does
seem to maintain a local
list of traversed k-mers
(hidden in the C++
implementation) but I do
not yet know how much that
would impact memory usage.
Have you tried using a
fewer number of threads?</div>
<div><br>
</div>
<div>But, rather than
guessing about causation,
let's try to get some more
diagnostic information.
Does the script die
immediately? (How long
does the PBS job execute
before failure?) Can you
attach the output and
error files for a job, and
also the job script? What
does</div>
<div> qstat -f
<job_id></div>
<div>where <job_id> is
the ID of your running
job, tell you about memory
usage?</div>
<div><br>
</div>
<div>Thanks,</div>
<div> Eric</div>
<div><br>
</div>
<div><br>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On
Mon, Apr 8, 2013 at 3:34
AM, Jens-Konrad Preem <span
dir="ltr"><<a
moz-do-not-send="true"
href="mailto:jpreem@ut.ee" target="_blank">jpreem@ut.ee</a>></span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">Hi,<br>
I am having trouble
completing a
partition-graph.py job.<br>
No matter the
configuration, it seems
to terminate with error
messages hinting at low
memory, etc.*<br>
Does lowering the subset
size reduce the memory
use? What about lowering
the number of parallel
threads?<br>
The graafik.ht file is
5.2 GB large, and I had
the script running as a
PBS job with 240 GB of
RAM allocated. (That's
as much as I can get;
maybe I'll have an
opportunity next week to
double it, but I
wouldn't count on
it.)<br>
Is it expected for the
script to require so
much RAM, or is there
some bug or some misuse
on my part? Would there
be any configuration
change to get past
this?<br>
<br>
Jens-Konrad Preem, MSc.,
University of Tartu<br>
<br>
<br>
<br>
* the latest
configuration, after I
decided on a smaller
subset size:<br>
./khmer/scripts/partition-graph.py
--threads 24
--subset-size 1e4
graafik<br>
terminated with<br>
cannot allocate memory
for thread-local data:
ABORT<br>
<br>
<br>
_______________________________________________<br>
khmer mailing list<br>
<a
moz-do-not-send="true"
href="mailto:khmer@lists.idyll.org" target="_blank">khmer@lists.idyll.org</a><br>
<a
moz-do-not-send="true"
href="http://lists.idyll.org/listinfo/khmer" target="_blank">http://lists.idyll.org/listinfo/khmer</a><br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr">
<div>Eric McDonald</div>
<div>HPC/Cloud Software
Engineer</div>
<div> for the Institute
for Cyber-Enabled
Research (iCER)</div>
<div> and the Laboratory
for Genomics, Evolution,
and Development (GED)</div>
<div>Michigan State
University</div>
<div>P: <a
moz-do-not-send="true"
href="tel:517-355-8733" value="+15173558733" target="_blank">517-355-8733</a></div>
</div>
</div>
</blockquote>
<br>
</div>
</div>
<span><font color="#888888">
<pre cols="72">--
Jens-Konrad Preem, MSc, University of Tartu</pre>
</font></span></div>
<br>
_______________________________________________<br>
khmer mailing list<br>
<a moz-do-not-send="true"
href="mailto:khmer@lists.idyll.org"
target="_blank">khmer@lists.idyll.org</a><br>
<a moz-do-not-send="true"
href="http://lists.idyll.org/listinfo/khmer"
target="_blank">http://lists.idyll.org/listinfo/khmer</a><br>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr">
<div>Eric McDonald</div>
<div>HPC/Cloud Software Engineer</div>
<div> for the Institute for
Cyber-Enabled Research (iCER)</div>
<div> and the Laboratory for Genomics,
Evolution, and Development (GED)</div>
<div>Michigan State University</div>
<div>P: <a moz-do-not-send="true"
href="tel:517-355-8733"
value="+15173558733" target="_blank">517-355-8733</a></div>
</div>
</div>
</div>
</div>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr">
<div>Eric McDonald</div>
<div>HPC/Cloud Software Engineer</div>
<div> for the Institute for Cyber-Enabled
Research (iCER)</div>
<div> and the Laboratory for Genomics,
Evolution, and Development (GED)</div>
<div>Michigan State University</div>
<div>P: <a moz-do-not-send="true"
href="tel:517-355-8733" value="+15173558733"
target="_blank">517-355-8733</a></div>
</div>
</div>
<br>
<fieldset></fieldset>
<br>
<pre>_______________________________________________
khmer mailing list
<a moz-do-not-send="true" href="mailto:khmer@lists.idyll.org" target="_blank">khmer@lists.idyll.org</a>
<a moz-do-not-send="true" href="http://lists.idyll.org/listinfo/khmer" target="_blank">http://lists.idyll.org/listinfo/khmer</a>
</pre>
</blockquote>
</div>
</div>
OK.<br>
I am posting a failed run, complete with the PBS script,
error log, and qstat -f snapshots at different times.<br>
I find it strange that I managed to complete the test run
on iowa-corn50M, which had an even larger graph file. Might
the number of threads used drive up the memory? For the
corn data I used the sample commands from the web page,
which used at most 4 threads. <br>
<span class="HOEnZb"><font color="#888888"> Jens-Konrad
Preem<br>
</font></span></div>
<br>
_______________________________________________<br>
khmer mailing list<br>
<a moz-do-not-send="true"
href="mailto:khmer@lists.idyll.org">khmer@lists.idyll.org</a><br>
<a moz-do-not-send="true"
href="http://lists.idyll.org/listinfo/khmer"
target="_blank">http://lists.idyll.org/listinfo/khmer</a><br>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
<div dir="ltr">
<div>Eric McDonald</div>
<div>HPC/Cloud Software Engineer</div>
<div> for the Institute for Cyber-Enabled Research (iCER)</div>
<div> and the Laboratory for Genomics, Evolution, and
Development (GED)</div>
<div>Michigan State University</div>
<div>P: 517-355-8733</div>
</div>
</div>
</blockquote>
<br>
<pre class="moz-signature" cols="72">--
Jens-Konrad Preem, MSc, University of Tartu</pre>
</body>
</html>