[TIP] unittest outcomes (pass/fail/xfail/error), humans and automation

Mon Dec 14 07:53:12 PST 2009

I think the idea of user-defined statuses is a good one, but sometimes 
people are too quick to make up a new test status.  You also need to 
provide an easier way to get more information out of a test result.

There is a benefit to having a small number of possible statuses:  They 
group into meaningful categories.  A test that reports a status of 
"error" is doing something fundamentally different from a test that 
reports "fail".

But it is not true that "status" is the only way to get information out 
of a test.  To use an example mentioned on the list, I don't think 
"error" because of HTTP response 400 is so fundamentally different from 
"error" because of HTTP response 500 that it deserves a different status.

If you define too many statuses, at some point you dilute their usefulness.  
Let it run wild (say, by having many developers each making up new ways 
for a test to fail/error for a few years), and you might end up with the 
standard pass, fail, error... plus error_http_500, error_http_400, 
error_http_timeout, fail_floating_point_exception, fail_data_format, 
fail_syntax, fail_off_by_10_percent, fail_off_by_20_percent, ...

You could end up in the interesting situation where you have 0 pass, 0 
fail, and 0 error, but 3 error_http_500, 1 error_http_400, 2 fail_fpe, 2 
fail_floating_point_exception (different from fpe how?), blah blah blah.

But you still need the information about the failure.  That is why we 
added the idea of "attributes" to pandokia.  The record that reports a 
test result can contain a "test result attribute" that conveys some 
information about the result.  So, for example, you might have

status=E
tra_http_response=400

or

status=E
tra_http_response=timeout

I see no particular reason why you couldn't add something like the 
pandokia attributes to unittest.  The nice thing about attributes is 
that they are a way to get arbitrary information out of the test, 
without the test system having to know anything about what it means.  
That is, I don't have to _define_ a new attribute -- I just have to use it:

    tra['http_response'] = str(httpobj.response)

If you formalize this idea and clearly document it, people will have an 
alternative for getting test-specific data in their result.  That means 
that they do not have an incentive to try to overload the status field 
by storing two (or more) different kinds of data in it.
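
As a rough illustration of why this keeps the status field useful, here 
is a small sketch with made-up results.  The one-letter codes P and F and 
the "reason" attribute are assumptions for the example; only status=E and 
http_response appear above.  The summary by status stays small, and the 
detail is still available by looking at the attribute.

```python
from collections import Counter

# Made-up results for illustration: (status, test result attributes).
# "P", "F", and the "reason" attribute are assumed names.
results = [
    ("P", {}),
    ("E", {"http_response": "500"}),
    ("E", {"http_response": "500"}),
    ("E", {"http_response": "400"}),
    ("F", {"reason": "data_format"}),
]

# The coarse summary groups tests into a few meaningful categories.
by_status = Counter(status for status, _ in results)

# The detail comes from the attribute, not from extra statuses.
by_http = Counter(
    tra["http_response"]
    for status, tra in results
    if status == "E" and "http_response" in tra
)

print(dict(by_status))
print(dict(by_http))
```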

In pandokia, we made Test Definition Attributes (TDA, something about 
the test inputs), Test Result Attributes (TRA, something computed during 
the test), and Test Configuration Attributes (TCA, something about the 
environment we are running in; not fully implemented yet, though).  
There are three name spaces because, in our work flow, they are 
maintained independently.  For example, I wouldn't want the test 
administrator (me) to set a new environment variable, add it to the 
attributes, then find that the name conflicts with an attribute computed 
by one of our thousands of nightly tests.  By keeping TCA and TRA in 
different name spaces, you never have to worry about that collision.
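
A minimal sketch of how separate name spaces avoid that collision, 
assuming each name space is a plain dict and that the flat record uses a 
prefix per name space.  The tra_ prefix appears in the example above; the 
tda_ and tca_ prefixes and the flatten() helper are assumptions made for 
this illustration.

```python
def flatten(tda, tra, tca):
    """Merge three attribute name spaces into one flat record.

    Hypothetical helper: each name is prefixed with its name space, so
    the same name may appear in more than one name space.
    """
    record = {}
    for prefix, attrs in (("tda", tda), ("tra", tra), ("tca", tca)):
        for name, value in attrs.items():
            # Values are reported as strings.
            record["%s_%s" % (prefix, name)] = str(value)
    return record


# The administrator and a test both happen to pick the name "timeout".
tca = {"timeout": 30}
tra = {"timeout": "exceeded"}

record = flatten({}, tra, tca)
for key in sorted(record):
    print("%s=%s" % (key, record[key]))
```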

The name of an attribute is an arbitrary non-case-sensitive string[1], 
and the value is a string.  In nosetests in Python, we collect the test 
results through a nose plugin that recognizes variables tda, tra, and 
tca either as globals in the test module or as attributes on the 
unittest object, depending on which kind of test nose is running.  I 
didn't work on the nose plugin, but I seem to remember that all you 
would need to add tda/tra/tca to unittest is a way to report them when 
the test is done.

So, to summarize my point:  Yes, you should offer a way to create new 
values for the test status, but you should offer an even easier way to 
get information about what happened in a particular test, so that people 
are not tempted to create a new test status to encode information that 
doesn't really belong to the status.

Mark S.

[1] Why non-case-sensitive?  I had problems with people using 
inconsistent camel casing.  And can you really argue that it is a good 
idea to name one field "someRaNdomFieldName" and another 
"somEraNdomFieldName"?
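
One simple way to get that behavior, sketched here under the assumption 
that names are folded to lower case when they are stored (AttributeDict 
is a hypothetical name, and this is not necessarily how pandokia does it):

```python
class AttributeDict:
    """Hypothetical attribute store with case-insensitive names."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, name, value):
        # Fold the name to lower case; store the value as a string.
        self._data[name.lower()] = str(value)

    def __getitem__(self, name):
        return self._data[name.lower()]

    def __contains__(self, name):
        return name.lower() in self._data

    def __len__(self):
        return len(self._data)


tra = AttributeDict()
tra["someRaNdomFieldName"] = 1
tra["somEraNdomFieldName"] = 2  # same attribute, different casing

print(len(tra), tra["SomeRandomFieldName"])
```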