[TIP] Which is the proper way to test an "unordered list"

Mon May 21 09:30:45 PDT 2012

Thanks Gregory,

That seems to be the ideal case for me. I'll start using it.

On Sat, May 19, 2012 at 2:22 AM, Gregory P. Smith <greg at krypto.org> wrote:
> On Fri, May 18, 2012 at 6:24 AM, Jorge Vargas <jorge.vargas at gmail.com>
> wrote:
>>
>> Thank you all for the answers.
>>
>> This got me thinking about my dataset
>> - The order is irrelevant, however it's good to have.
>> - There should be only one of each type but it is not really enforced.
>> Most checks here are in the for of "value in states" so if someone
>> makes the mistake of adding the same state twice it doesn't really
>> affect things.
>>
>> So a set seems to be the proper construction here. That said I could
>> stick with a list as there is no real need of the overhead of sets (I
>> know it's little) but it's really irrelevant to keep it as the only
>> case where the list will contain the same value twice is by programmer
>> error. So i believe the sorted(list) solution makes the most sense. It
>> will catch errors as a state not being "declared" (that's what the
>> decorator does), catch if the same state is added twice but will not
>> really care about the order.
>>
>
> See also assertItemsEqual (renamed to assertCountEqual in 3.2).
>
>  http://docs.python.org/library/unittest.html#unittest.TestCase.assertItemsEqual
>  http://docs.python.org/dev/library/unittest.html#unittest.TestCase.assertCountEqual
>
> -gps
>
>> On Thu, May 17, 2012 at 7:16 PM, Andrew Dalke <dalke at dalkescientific.com>
>> wrote:
>> > On May 17, 2012, at 3:30 AM, Jorge Vargas wrote:
>> >> My question is which will be the proper way to test for this when the
>> >> "order of the list" is not valid?
>> >
>> > The right answer depends on what the data set contains. Others have
>> > already suggested:
>> >
>> >  1) sorting; but that assumes that the objects can be compared
>> >      by inequality
>> >
>> >  2) converting to a set, but that assumes that objects are hashable,
>> >      and that duplicates can be ignored
>> >
>> > You can fix #2 by
>> >
>> >  3) use a collections.Counter; this still requires hashable objects
>> >       but doesn't have the count problem
>> >
>> >
>> > The only generic solution is:
>> >
>> >  4) Do the O(n*m) test of checking each item from one list to see
>> >       if it occurs in the other, removing matches as they are found
>> >
>> > Most people are uncomfortable with potentially quadratic worst-case
>> > times, but with only a couple of states, this isn't a problem.
>> >
>> >
>> > In your case, you can probably use the sets solution without a problem,
>> > especially if your "next" implementation verifies that there are no
>> > duplicates in the list. Or as Herman's commented, perhaps your
>> > implementation is more appropriate with a set in the first place?
>> >
>> > Cheers,
>> >
>> >                                Andrew
>> >                                dalke at dalkescientific.com
>> >
>> >
>> >
>> > _______________________________________________
>> > testing-in-python mailing list
>> > testing-in-python at lists.idyll.org
>> > http://lists.idyll.org/listinfo/testing-in-python
>>
>> _______________________________________________
>> testing-in-python mailing list
>> testing-in-python at lists.idyll.org
>> http://lists.idyll.org/listinfo/testing-in-python
>
>