[TIP] testing: why bother?

C. Titus Brown ctb at msu.edu
Thu Mar 24 20:49:26 PDT 2011

On Wed, Mar 23, 2011 at 04:38:17PM +0000, Michael Foord wrote:
> On 23/03/2011 16:26, Michael Foord wrote:
>> On 23/03/2011 15:37, C. Titus Brown wrote:
>>> [snip...]
>>> Also I don't know of any actual evidence that TDD or agile actually
>>> improves programming speed, ability, thought, or anything else, over
>>> other methods of specification (more formal ones, for example).
>>> Anecdotes need not apply.
> These studies are the best I'm aware of on benefits of TDD:

[ ... ]

So, I spent some time hunting for *open* meta-discussions.  Couldn't find any,
but a chapter in _Making Software_ did a meta-analysis, summarized (by someone
other than the original authors) here:


I think the most interesting conclusion to take away from the _Making Software_
book (which is a great resource for these things) was the following statement:

Although most of the trials did not measure or control the amount of the TDD
pill taken (which in software parlance translates into a lack of attention to
process conformance), we believe that the dosage ended up being variable across
trials and subjects. Trials with poor or unknown constructs may not have
strictly enforced TDD usage, and we believe it is highly likely that the trial
participants customized the pill with a selection of ingredients rather than
following the strict textbook definition of TDD. This issue poses a serious
threat to drawing generalized conclusions.

In other words, since people customize development and testing practices
to their local conditions, to generalize you either need LOTS of evidence (to
average out across all of those specifics) or you need a set of controlled
studies (where the same approach is implemented in very similar ways for at
least a few projects).
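To make the "strict textbook definition of TDD" concrete for anyone who hasn't seen it spelled out: the textbook cycle is to write a failing test first, then write the smallest code that passes, then refactor. Here's a minimal Python sketch of one such red/green iteration (the `slugify` function and its test are purely illustrative, not taken from any of the trials):

```python
import unittest

# Step 1 (red): write the test first, before the implementation exists.
# Running it at this point would fail with a NameError -- that's the point.
class TestSlugify(unittest.TestCase):
    def test_lowercases_and_joins_with_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

# Step 2 (green): write the minimal code that makes the test pass.
def slugify(text):
    return "-".join(text.lower().split())

# Step 3 (refactor): clean up while keeping the test green, then repeat.

if __name__ == "__main__":
    unittest.main(exit=False)
```

The point the chapter makes is that real teams rarely follow this loop so strictly -- they test after the fact, batch up tests, or skip the refactor step -- which is exactly the "variable dosage" problem.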

The chapter is not easy to read, but it makes a number of other interesting points; for example:

Available evidence from the trials suggests that TDD does not have a consistent
effect on internal quality. Although TDD appears to yield better results over
the control group for certain types of metrics (complexity and reuse), other
metrics (coupling and cohesion) are often worse in the TDD treatment. Another
observation from the trial data is that TDD yields production code that is less
complex at the method/class level, but more complex at the package/project
level. This inconsistent effect is more visible in more rigorous trials (i.e.,
L2 and L3 trials).

which fits with my own intuition, and of course the caveat clause:

The effects of TDD still involve many unknowns. Indeed, the evidence is not
undisputedly consistent regarding TDD’s effects on any of the measures we
applied: internal and external quality, productivity, or test quality. Much of
the inconsistency likely can be attributed to internal factors not fully
described in the TDD trials. Thus, TDD is bound to remain a controversial topic
of debate and research.


> The microsoft empirical study is interesting because they have large
> scale development with varying practices, and are in a position to
> compare those practises across projects.

It's worth pointing out that in many CSE circles, the argument that MS and IBM
have adopted a particular approach is all the argument that is needed to start
teaching that approach :)


You might also be interested in the ACM/IEEE 2009 update to the CS2001 curriculum


which says,

The following topics tended to receive attention in the dialogue with industry:
quality issues
   o	testing, debugging and bug tracking
   o	checking on code readability and documentation
   o	code reviews
software engineering principles and techniques
   o	this included such matters as basic release management principles and basic source control
   o	best practices for developing software in teams

On the one hand, it's kind of sad that these things need to be said again
so late in the game; on the other hand, at least someone's saying them :)

C. Titus Brown, ctb at msu.edu

More information about the testing-in-python mailing list