<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    One clarification:<br>
    <br>
    The file submitted to do-partition.py (file.below) contains 2576771
    reads.<br>
    The job was launched with the following options:<br>
    khmer-BETA/scripts/do-partition.py -k 20 -x 1e9 -T 20 file.graphbase
    file.below<br>
    <br>
    Alexis<br>
    <br>
    <br>
    <div class="moz-cite-prefix">On 21/03/2013 10:13, Alexis Groppi
      wrote:<br>
    </div>
    <blockquote cite="mid:514ACF3A.9020006@u-bordeaux2.fr" type="cite">
      Hi Eric,<br>
      <br>
      The do-partition.py script has now been running for 22 hours.<br>
      Only file.info has been generated. No .pmap files have been created.<br>
      <br>
      qstat -f gives:<br>
      &nbsp;&nbsp;&nbsp; resources_used.cput = 441:04:21<br>
      &nbsp;&nbsp;&nbsp; resources_used.mem = 12764228kb<br>
      &nbsp;&nbsp;&nbsp; resources_used.vmem = 13926732kb<br>
      &nbsp;&nbsp;&nbsp; resources_used.walltime = 22:05:56<br>
      <br>
      The server has 256 GB of RAM and
      256 GB of swap space.<br>
      <br>
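      A rough way to read these figures, sketched in Python (this is not
      part of khmer, and the helper names are made up for illustration):
      dividing cput by walltime estimates how many cores the job kept busy
      on average, and mem and vmem can be set against the RAM on the
      server. The sketch assumes that the "kb" reported by qstat means
      1024 bytes.<br>
<pre>
```python
def hms_to_seconds(text):
    """Convert an 'HH:MM:SS' string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def busy_cores(cput, walltime):
    """Average number of busy cores: CPU time divided by elapsed time."""
    return hms_to_seconds(cput) / hms_to_seconds(walltime)


def kb_to_gib(text):
    """Convert a value such as '12764228kb' to GiB (assumes 1 kb = 1024 bytes)."""
    if not text.endswith("kb"):
        raise ValueError("expected a value ending in 'kb': " + text)
    return int(text[:-2]) / 1024 ** 2


# Values copied from the qstat -f output quoted in this message.
cores = busy_cores("441:04:21", "22:05:56")
mem_gib = kb_to_gib("12764228kb")
vmem_gib = kb_to_gib("13926732kb")
print(round(cores, 1), round(mem_gib, 1), round(vmem_gib, 1))
```
</pre>
      For the values above, the ratio comes out at roughly 20 busy cores,
      and both memory figures are around 12 to 13 GiB, far below the RAM
      on the server, so the job does not appear to be swapping.<br>
      <br>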
      What do you think?<br>
      <br>
      Thanks<br>
      <br>
      Alexis<br>
      <br>
      <div class="moz-cite-prefix">On 20/03/2013 16:43, Alexis Groppi
        wrote:<br>
      </div>
      <blockquote cite="mid:5149D918.8080602@u-bordeaux2.fr" type="cite">
        Hi Eric,<br>
        <br>
        Actually, the previous job was terminated when it hit the
        walltime limit.<br>
        I have relaunched the script.<br>
        qstat -fr gives:<br>
        &nbsp;&nbsp;&nbsp; resources_used.cput = 93:23:08<br>
        &nbsp;&nbsp;&nbsp; resources_used.mem = 12341932kb<br>
        &nbsp;&nbsp;&nbsp; resources_used.vmem = 13271372kb<br>
        &nbsp;&nbsp;&nbsp; resources_used.walltime = 04:42:39<br>
        <br>
        So far, only file.info has been generated.<br>
        <br>
        Let's wait and see...<br>
        <br>
        Thanks again<br>
        <br>
        Alexis<br>
        <br>
        <br>
        <div class="moz-cite-prefix">On 19/03/2013 21:50, Eric McDonald
          wrote:<br>
        </div>
        <blockquote
cite="mid:CAGhFaV3U77wRhRZ5dfZ1xrqjdbnS51pWcBR+cDnZ8phsXy-Sxw@mail.gmail.com"
          type="cite">
          <div dir="ltr">Hi Alexis,
            <div><br>
            </div>
            <div style="">What does:</div>
            <div style="">&nbsp; qstat -f &lt;job-id&gt;</div>
            <div style="">where &lt;job-id&gt; is the ID of your job
              tell you for the following fields:</div>
            <div style="">&nbsp;&nbsp;resources_used.cput</div>
            <div style="">&nbsp;&nbsp;resources_used.vmem</div>
            <div style=""><br>
            </div>
            <div style="">And how do those values compare to the actual
              amount of elapsed time for the job, the amount of physical
              memory on the node, and the total memory (RAM + swap
              space) on the node?</div>
            <div style="">Just checking to make sure that everything is
              running as it should be and that your process is not
              heavily into swap or something like that.</div>
            <div style=""><br>
            </div>
            <div style="">Thanks,</div>
            <div style="">&nbsp; Eric</div>
            <div style=""><br>
            </div>
          </div>
          <div class="gmail_extra"><br>
            <br>
            <div class="gmail_quote">On Tue, Mar 19, 2013 at 11:23 AM,
              Alexis Groppi <span dir="ltr">&lt;<a
                  moz-do-not-send="true"
                  href="mailto:alexis.groppi@u-bordeaux2.fr"
                  target="_blank">alexis.groppi@u-bordeaux2.fr</a>&gt;</span>
              wrote:<br>
              <blockquote class="gmail_quote" style="margin:0 0 0
                .8ex;border-left:1px #ccc solid;padding-left:1ex">
                <div text="#000000" bgcolor="#FFFFFF"> Hi Adina,<br>
                  <br>
                  First of all, thanks for your answer and your advice
                  :)<br>
                  The extract-partitions.py script works!<br>
                  As for do-partition.py on my second set, it has been
                  running for 32 hours. Shouldn't it have produced at least
                  one temporary .pmap file by now?<br>
                  <br>
                  Thanks again<br>
                  <br>
                  Alexis<br>
                  <br>
                  <div>On 19/03/2013 12:58, Adina Chuang Howe wrote:<br>
                  </div>
                  <blockquote type="cite">
                    <div>
                      <div class="h5"><br>
                        <br>
                        <div class="gmail_quote">
                          <blockquote class="gmail_quote"
                            style="margin:0 0 0 .8ex;border-left:1px
                            #ccc solid;padding-left:1ex"> Message: 1<br>
                            Date: Tue, 19 Mar 2013 10:41:45 +0100<br>
                            From: Alexis Groppi &lt;<a
                              moz-do-not-send="true"
                              href="mailto:alexis.groppi@u-bordeaux2.fr"
                              target="_blank">alexis.groppi@u-bordeaux2.fr</a>&gt;<br>
                            Subject: [khmer] Duration of do-partition.py
                            (very long !)<br>
                            To: <a moz-do-not-send="true"
                              href="mailto:khmer@lists.idyll.org"
                              target="_blank">khmer@lists.idyll.org</a><br>
                            Message-ID: &lt;<a moz-do-not-send="true"
                              href="mailto:514832D9.7090207@u-bordeaux2.fr"
                              target="_blank">514832D9.7090207@u-bordeaux2.fr</a>&gt;<br>
                            Content-Type: text/plain;
                            charset="iso-8859-1"; Format="flowed"<br>
                            <br>
                            Hi Titus,<br>
                            <br>
                            After digital normalization and
                            filter-below-abund, on your advice I<br>
                            ran do-partition.py on 2
                            sets of data (approx. 2.5 million<br>
                            reads of 75 nt):<br>
                            <br>
                            /khmer-BETA/scripts/do-partition.py -k 20 -x
                            1e9<br>
/ag/khmer/Sample_174/174r1_prinseq_good_bFr8.fasta.keep.below.graphbase<br>
/ag/khmer/Sample_174/174r1_prinseq_good_bFr8.fasta.keep.below<br>
                            and<br>
                            /khmer-BETA/scripts/do-partition.py -k 20 -x
                            1e9<br>
/ag/khmer/Sample_174/174r2_prinseq_good_1lIQ.fasta.keep.below.graphbase<br>
/ag/khmer/Sample_174/174r2_prinseq_good_1lIQ.fasta.keep.below<br>
                            <br>
                            For the first one I got a<br>
                            174r1_prinseq_good_bFr8.fasta.keep.below.graphbase.info
                            file containing the<br>
                            information: 33 subsets total<br>
                            After that, 33 .pmap files, from 0.pmap to
                            32.pmap, were created at regular intervals,<br>
                            and finally I got a single file,<br>
                            174r1_prinseq_good_bFr8.fasta.keep.below.part
                            (all the .pmap files were<br>
                            deleted).<br>
                            This run took approx. 56 hours.<br>
                            <br>
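                            The progress described above (an .info file
                            announcing "N subsets total", then one .pmap
                            file per subset) can be checked with a small
                            script. This is only a sketch and not part of
                            khmer: the function name is hypothetical, and
                            it assumes that the .info file contains the
                            text "N subsets total" as quoted above and
                            that the temporary files end in .pmap and sit
                            in a single directory.<br>
<pre>
```python
import glob
import os
import re


def partition_progress(info_path, pmap_dir):
    """Return (pmap files found, subsets expected) for one run."""
    # Read the expected subset count from the .info file.
    with open(info_path) as handle:
        match = re.search(r"(\d+) subsets total", handle.read())
    if match is None:
        raise ValueError("no 'N subsets total' text in " + info_path)
    expected = int(match.group(1))
    # Count the temporary .pmap files written so far.
    found = len(glob.glob(os.path.join(pmap_dir, "*.pmap")))
    return found, expected
```
</pre>
                            Note that the count drops back to zero once
                            the .pmap files are merged and deleted at the
                            end of a run.<br>
                            <br>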
                            For the second set (174r2), do-partition.py
                            has been running for 32 hours,<br>
                            but I have only got the<br>
                            174r2_prinseq_good_1lIQ.fasta.keep.below.graphbase.info
                            file, containing the<br>
                            information: 35 subsets total<br>
                            And nothing more...<br>
                            <br>
                            Is this duration "normal"?<br>
                          </blockquote>
                          <div><br>
                          </div>
                          <div>Yes, this is typical. &nbsp;The longest I've
                            had it run is 3 weeks, for a very large data set
                            (billions of reads). &nbsp;In general,
                            partitioning is the most time-consuming of
                            all the steps. &nbsp;Once it's finished, you'll
                            have much smaller files which can be
                            assembled very quickly. &nbsp;Since I run
                            assembly with multiple assemblers and with
                            multiple K lengths, this gain is often
                            significant for me.</div>
                          <div><br>
                          </div>
                          <div>To get the actual partitioned files, you
                            can use the following script:</div>
                          <div><br>
                          </div>
                          <div><a moz-do-not-send="true"
href="https://github.com/ged-lab/khmer/blob/master/scripts/extract-partitions.py"
                              target="_blank">https://github.com/ged-lab/khmer/blob/master/scripts/extract-partitions.py</a></div>
                          <div><br>
                          </div>
                          <blockquote class="gmail_quote"
                            style="margin:0 0 0 .8ex;border-left:1px
                            #ccc solid;padding-left:1ex"> (The
                            thread count was left at its default (4
                            threads))<br>
                            33 subsets and only one file at the end?<br>
                            Should I stop do-partition.py on the second
                            set and rerun it with more<br>
                            threads?<br>
                            <br>
                          </blockquote>
                          <div><br>
                          </div>
                          <div>I'd suggest letting it run.</div>
                          <div><br>
                          </div>
                          <div>Best,</div>
                          <div>Adina</div>
                        </div>
                        <br>
                        <fieldset></fieldset>
                        <br>
                      </div>
                    </div>
                    <pre>_______________________________________________
khmer mailing list
<a moz-do-not-send="true" href="mailto:khmer@lists.idyll.org" target="_blank">khmer@lists.idyll.org</a>
<a moz-do-not-send="true" href="http://lists.idyll.org/listinfo/khmer" target="_blank">http://lists.idyll.org/listinfo/khmer</a><span class="HOEnZb"><font color="#888888">
</font></span></pre>
                    <span class="HOEnZb"><font color="#888888"> </font></span></blockquote>
                  <span class="HOEnZb"><font color="#888888"> <br>
                      <div>-- <br>
                        <img
                          src="cid:part11.00010602.04080502@u-bordeaux2.fr"
                          border="0"></div>
                    </font></span></div>
                <br>
                <br>
              </blockquote>
            </div>
            <br>
            <br clear="all">
            <div><br>
            </div>
            -- <br>
            <div dir="ltr">
              <div>Eric McDonald</div>
              <div>HPC/Cloud Software Engineer</div>
              <div>&nbsp; for the Institute for Cyber-Enabled Research (iCER)</div>
              <div>&nbsp; and the Laboratory for Genomics, Evolution, and
                Development (GED)</div>
              <div>Michigan State University</div>
              <div>P: 517-355-8733</div>
            </div>
          </div>
        </blockquote>
        <br>
      </blockquote>
      <br>
    </blockquote>
    <br>
  </body>
</html>