<div dir="ltr"><div><div>You might have better luck posting this to the general Python mailing list, since the majority of people are there.<br></div>But do you happen to know the PEP regarding these issues?<br><br></div>It&#39;s interesting to hear about the Python 3 decision, though I was also warned in the past, even back in 2.6, not to trust the output.<br>

<br><br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Sep 30, 2013 at 6:19 AM, Ned Batchelder <span dir="ltr">&lt;<a href="mailto:ned@nedbatchelder.com" target="_blank">ned@nedbatchelder.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On 9/30/13 1:09 AM, Chris Jerdonek wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

On Sun, Sep 29, 2013 at 6:14 PM, Ned Batchelder &lt;<a href="mailto:ned@nedbatchelder.com" target="_blank">ned@nedbatchelder.com</a>&gt; wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

On 9/29/13 4:19 PM, Chris Jerdonek wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

On Sun, Sep 29, 2013 at 12:20 PM, Ned Batchelder &lt;<a href="mailto:ned@nedbatchelder.com" target="_blank">ned@nedbatchelder.com</a>&gt;<br>

wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

On 9/29/13 12:56 PM, Chris Jerdonek wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

I have a question about the behavior of hashing prior to Python 3.3<br>

(when hash randomization was turned on by default [1]).<br>

<br>

I know that in earlier versions Python never made any guarantees about<br>

hash values and their effect on dictionary key ordering, etc [2].  But<br>

for testing purposes, in practice, to what extent does hashing behave<br>

the same across systems and Python versions prior to Python 3.3?  For<br>

example, the note at [2] says that &quot;it typically varies between 32-bit<br>

and 64-bit builds.&quot;<br>

<br>

I&#39;m asking because I&#39;m curious about the extent to which tests that<br>

unknowingly depend on hash values are reproducible across systems and<br>

versions.<br>

</blockquote>

<br>

Tests like that are not reproducible across systems and versions. They<br>

may<br>

not be reproducible as the product code changes.  Two equal dicts may not<br>

iterate in the same order, even within a single process:<br>

</blockquote>

What explains the following then?  For quite a while, the unit tests<br>

for a project I maintained always passed for all versions, but only<br>

when I added 3.3 (when hash randomization was enabled) did I get<br>

intermittent failures on such a test.  Do some such tests tend to<br>

behave the same *in practice* -- even to a limited extent?  Otherwise,<br>

I would have expected to see test failures in the earlier versions, at<br>

least sometime.<br>

</blockquote>

<br>

Many dictionaries will behave the same across versions and systems. In the<br>

example I gave below, if the keys were integers instead of strings, the two<br>

dicts would iterate the same.  It all comes down to the hash values of your<br>

actual keys.<br>

<br>

When I said &quot;tests like that are not reproducible,&quot; I didn&#39;t mean that they<br>

would actually behave differently.  I meant that you couldn&#39;t count on them<br>

always behaving the same.<br>

<br>

Your test dictionaries happened to fall into a reproducible scenario.  The<br>

fact that they always behaved the same doesn&#39;t change the fact: you were<br>

relying on undefined behavior (the iteration sequence of dict keys).  Python<br>

3.3 shook things up enough for it to actually change the outcome of your<br>

program.<br>

</blockquote>

Hmm, there&#39;s a subtle distinction here.  There&#39;s a difference between<br>

the questions of whether equal dictionaries iterate in the same order<br>

and whether the same code yields the same values on different systems<br>

and versions.<br>

<br>

For example, I ran your sample code on Python 2.7, 3.2, 3.3 (on 3.3<br>

setting PYTHONHASHSEED=0), and PyPy 1.9, and all yielded the same<br>

representations of d1 and d2.  So if a test depended on that ordering<br>

of d2, it seems like it would be reproducible across versions at least<br>

for those versions.<br>

<br>

To rephrase my question, I&#39;m asking to what extent code can be<br>

expected or guaranteed to behave the same if the hash seed is the same<br>

(as it was by default prior to 3.3).  Can you provide an example of<br>

code that behaves differently on some systems or versions?<br>

</blockquote>

<br></div></div>

Using my same code example, but printing the keys more compactly, I get this:<br>

<br>

   $ python2.4 dictorder.py<br>

   d1: 1,0,3,2,5,4,7,6,9,8<br>

   d2: 9,2,1,5,0,3,4,6,7,8<br>

   $ python2.5 dictorder.py<br>

   d1: 1,0,3,2,5,4,7,6,9,8<br>

   d2: 9,1,5,2,0,3,4,6,7,8<br>

   $ jython dictorder.py<br>

   d1: 8,1,7,5,0,2,6,3,9,4<br>

   d2: 0,8,5,1,7,6,4,2,9,3<br>

<br>
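For reference, a minimal sketch of what a script like dictorder.py might look like. The original script is not shown in this thread, so the function name build_dicts and the exact print format are assumptions based on the earlier interactive example and the output above:<br>

```python
# Hypothetical reconstruction of dictorder.py; the real script is not in this thread.
# Since Python 3.7 dicts preserve insertion order, so on modern interpreters
# both lines list the keys 0 through 9 in order. The differing orders shown
# above come from older interpreters.


def build_dicts(big=1000000, small=10):
    # d1 is created directly with only the small set of keys.
    d1 = dict.fromkeys(str(i) for i in range(small))
    # d2 starts out large and is trimmed down to the same keys,
    # so its internal table has a different history than d1's.
    d2 = dict.fromkeys(str(i) for i in range(big))
    for i in range(small, big):
        del d2[str(i)]
    return d1, d2


if __name__ == "__main__":
    d1, d2 = build_dicts()
    # Print the keys compactly, in iteration order.
    print("d1: " + ",".join(d1))
    print("d2: " + ",".join(d2))
    print("equal: " + str(d1 == d2))
```

The two dicts compare equal whatever order their keys come out in, which is why a test should compare the dicts themselves (or sorted keys) rather than their printed form.<br>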

I don&#39;t know what changed between 2.4 and 2.5; I just know that I wasn&#39;t allowed to complain about it.  Every version since 2.5 has agreed with 2.5.  Jython, of course, is a completely different beast.  Dict iteration order isn&#39;t guaranteed.<span class="HOEnZb"><font color="#888888"><br>


<br>

--Ned.</font></span><div class="HOEnZb"><div class="h5"><br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

I&#39;m asking because if a test is flaky because of its dependence on<br>

hash values, it would be useful to know that knowing the hash seed is<br>

enough to reliably reproduce the test failure (e.g. if some user or CI<br>

server encountered a sporadic failure).  Otherwise, reproducing the<br>

failure would be harder.<br>

<br>
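A rough sketch of how one might check this on a single interpreter, assuming Python 3.3+ where the PYTHONHASHSEED environment variable is honored. The helper name order_with_seed and the child snippet are made up for illustration. It only shows that a fixed seed reproduces the same order on the same build, not across versions or platforms:<br>

```python
import os
import subprocess
import sys

# Child snippet (illustrative): print the iteration order of a set of string
# keys. A set is used because its order still depends on the hash values.
SNIPPET = "print(','.join(set(str(i) for i in range(10))))"


def order_with_seed(seed):
    # Start a fresh interpreter with a fixed hash seed and capture its output.
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output([sys.executable, "-c", SNIPPET], env=env)
    return out.decode().strip()


if __name__ == "__main__":
    for seed in (0, 1, 2):
        # Two runs with the same seed should print the same order.
        print(seed, order_with_seed(seed), order_with_seed(seed))
```

If a CI run records the seed it used, rerunning the failing test with that value on the same interpreter build is a reasonable first attempt at reproducing an order-dependent failure.<br>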

--Chris<br>

<br>

<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

--Ned.<br>

<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

--Chris<br>

<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

&gt;&gt;&gt; d1 = dict.fromkeys(str(i) for i in range(10))<br>

&gt;&gt;&gt; d2 = dict.fromkeys(str(i) for i in range(1000000))<br>

&gt;&gt;&gt; for i in range(10, 1000000):<br>

...   del d2[str(i)]<br>

...<br>

&gt;&gt;&gt; d1 == d2<br>

True<br>

&gt;&gt;&gt; d1<br>

{&#39;1&#39;: None, &#39;0&#39;: None, &#39;3&#39;: None, &#39;2&#39;: None, &#39;5&#39;: None, &#39;4&#39;: None, &#39;7&#39;: None, &#39;6&#39;: None, &#39;9&#39;: None, &#39;8&#39;: None}<br>

&gt;&gt;&gt; d2<br>

{&#39;9&#39;: None, &#39;1&#39;: None, &#39;5&#39;: None, &#39;2&#39;: None, &#39;0&#39;: None, &#39;3&#39;: None, &#39;4&#39;: None, &#39;6&#39;: None, &#39;7&#39;: None, &#39;8&#39;: None}<br>

<br>

Be careful out there...<br>

<br>

--Ned.<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

--Chris<br>

<br>

[1]<br>

<a href="http://docs.python.org/3/whatsnew/3.3.html#builtin-functions-and-types" target="_blank">http://docs.python.org/3/whatsnew/3.3.html#builtin-functions-and-types</a><br>

[2] <a href="http://docs.python.org/2/using/cmdline.html#cmdoption-R" target="_blank">http://docs.python.org/2/using/cmdline.html#cmdoption-R</a><br>

<br>

_______________________________________________<br>

testing-in-python mailing list<br>

<a href="mailto:testing-in-python@lists.idyll.org" target="_blank">testing-in-python@lists.idyll.org</a><br>

<a href="http://lists.idyll.org/listinfo/testing-in-python" target="_blank">http://lists.idyll.org/listinfo/testing-in-python</a><br>

</blockquote>

<br>

</blockquote></blockquote></blockquote></blockquote>

<br>

<br>


</div></div></blockquote></div><br></div>