[TIP] Test coverage and parsing.

Michał Kwiatkowski constant.beta at gmail.com
Mon Oct 12 10:23:03 PDT 2009


2009/10/12 Olemis Lang <olemis at gmail.com>:
> The only warning I want to make about this is that I don't like the
> workflow where people create models and generate both code and tests
> using it ...

OK, it doesn't seem to be the case here - properties of the system are
not spelled out explicitly in the code - otherwise why codify them
using tests?

To take a classical example: the "reverse" function has the property
that for any list, reverse(reverse(list)) == list. The model of any
algorithm implementing reverse doesn't contain this property - it's
something outside the model. It's the kind of property that emerges
from a correct implementation of reverse.

Notice that the function "id" (or "identity") also has this property,
which suggests that a description of a single property won't do - you
need a whole set of them. And that brings us to the issue of how many
tests are enough.
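
To make that concrete, here's a minimal Python sketch (with toy
one-line implementations of reverse and identity, picked just for
illustration) showing that the round-trip property alone can't tell
the two apart, while adding a second property can:

def prop_roundtrip(fn, xs):
    # applying fn twice should give back the original list
    return fn(fn(xs)) == xs

def prop_moves_first_to_last(fn, xs):
    # for non-empty lists, the first element should end up last
    return not xs or fn(xs)[-1] == xs[0]

def reverse(xs): return xs[::-1]   # toy implementation
def identity(xs): return xs        # toy implementation

# Both functions satisfy the round-trip property...
assert prop_roundtrip(reverse, [1, 2, 3])
assert prop_roundtrip(identity, [1, 2, 3])
# ...but only reverse satisfies the second one, hence the need for a
# whole set of properties.
assert prop_moves_first_to_last(reverse, [1, 2, 3])
assert not prop_moves_first_to_last(identity, [1, 2, 3])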

>> The produced test suite will be
>> closer to the textual specification and you can adjust the scale of
>> your tests more easily (by which I mean how many test cases for each
>> spec you want to run). It's kind of like fuzzy testing, but with a bit
>> less fuzz. ;-)
>
> well but special attention must be given to coverage and regression
> testing . I found some gaps in there

I think coverage is a totally different issue. If you're concerned
about the behavior of your code, you shouldn't care about coverage.
Once you have a comprehensive test suite that describes all of the
behavior you care about, it's quite possible that you already have
full coverage. You could remove test cases and still have 100%
coverage - does that mean those test cases are superfluous?

In other words, you shouldn't care about coverage once you have 100%. ;-)

>> It's a general statement about all tests: you can't use them to prove
>> your code works, no matter how many of them you have.
>
> Yes that's right. But when manual tests are written then there's a
> *good* judgment about whether test cases are useful & correct or not.
> OTOH if they are generated then you just can't say ...

QuickCheck doesn't generate test cases so much as it generates sample
data. You, as a programmer, are still responsible for:
 1. specifying properties of the code
 2. specifying characteristics of the test data
It's a simple idea, really: bringing your test notation closer to the
one encountered in specifications.
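
A rough sketch of that division of labor in plain Python (not the
actual PeckCheck or QuickCheck API, just the standard random module
and an arbitrary example property):

import random

# 1. a property of the code: sorting is idempotent
def prop_sort_idempotent(xs):
    return sorted(sorted(xs)) == sorted(xs)

# 2. characteristics of the test data: lists of up to 20 small integers
def a_list_of_ints():
    return [random.randint(-100, 100) for _ in range(random.randint(0, 20))]

def test_sort_idempotent():
    # check the property against 200 freshly generated samples
    for _ in range(200):
        assert prop_sort_idempotent(a_list_of_ints())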

For me it looks like this: for each test case I could come up with
manually written values, OR I could put in some more work to specify
functions that will generate those kinds of values. For each
interesting corner-case value I could make a new generator or a branch
in an existing one. Now I can generate hundreds of values very
easily, some of which may point me to new corner cases I haven't
anticipated. I've put in some extra work and gotten some value back. I
don't see how this process yields less trustworthy tests than the
standard one (where you hard-code values).
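
For example, a generator could grow a new branch per corner case (a
sketch, with the corner cases chosen arbitrarily):

import random

def a_list_of_ints():
    # each branch covers one corner case; new cases become new branches
    branch = random.choice(["empty", "singleton", "all-duplicates", "random"])
    if branch == "empty":
        return []
    if branch == "singleton":
        return [random.randint(-100, 100)]
    if branch == "all-duplicates":
        return [random.randint(-100, 100)] * random.randint(2, 10)
    return [random.randint(-100, 100) for _ in range(random.randint(0, 50))]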

BTW, nose has this nice feature called test generators
(http://somethingaboutorange.com/mrl/projects/nose/0.11.1/writing_tests.html#test-generators)
and you can use it for QuickCheck-like tests just as easily as you can
use PeckCheck.
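
With nose that could look roughly like this - each yielded
(check, data) pair is collected as a separate test case (the reverse
round-trip property is just an example):

import random

def check_reverse_roundtrip(xs):
    assert xs[::-1][::-1] == xs

def test_reverse_roundtrip():
    # nose runs every yielded (function, argument) pair as its own test
    for _ in range(100):
        xs = [random.randint(-100, 100) for _ in range(random.randint(0, 20))]
        yield check_reverse_roundtrip, xs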

> ... since you don't actually control all the details of the generator
> then what if 83 % of your tests are performed with irrelevant or
> redundant data ?

No, you *do* control all the important details of generation. It's the
tester who decides how values are built and what the probabilities of
building each possible combination are. You can even hard-code a few
manually-prepared values in your test generator if that gives you
warm fuzzies.
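
For instance (again just a sketch), the generator can reserve an
explicit share of the runs for hand-picked values, with the
probabilities spelled out by the tester:

import random

HAND_PICKED = [[], [0], [1, 1, 2, 2, 3, 3]]

def a_list_of_ints():
    # the tester decides: 20% hand-picked corner cases, 80% random lists
    if random.random() < 0.2:
        return random.choice(HAND_PICKED)
    return [random.randint(-1000, 1000) for _ in range(random.randint(0, 50))]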

> That's why I mentioned CI environments, because they
> seem to be perfect to display and track reports about distribution of
> test data, created by the test generator (e.g. QuickCheck 's xxx |
> `collect` AFAICR function ).

Yes, collect and classify.
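
In Python you can get a similar report by tallying a label for each
generated value and printing the distribution at the end, so a CI job
can track it (a sketch, not QuickCheck's actual API):

import random
from collections import defaultdict

def classify(xs):
    if not xs:
        return "empty"
    return "short" if len(xs) < 5 else "long"

def test_reverse_roundtrip_with_report():
    counts = defaultdict(int)
    for _ in range(1000):
        xs = [random.randint(0, 9) for _ in range(random.randint(0, 20))]
        counts[classify(xs)] += 1
        assert xs[::-1][::-1] == xs
    # the distribution of generated data for this run
    print(dict(counts))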

>> Still, IMHO
>> having 1000 diverse test cases auto-generated from a declarative spec
>> (and different set of 1000 each time!) is better than having a few
>> written manually.
>
> Certainly ... but IMHO only if test data is good enough.

The same applies to manually prepared test data.

>> I agree. Specs for things like XML, SQL or JSON are there, but having
>> executable specs would be even better. :-)
>> Seriously though, right now it sounds like a pipe dream
>
> why exactly ? (I'm a neophyte, can't see the light at the end of the
> tunnel :-/ )

Well, our flexible natural language is currently the best tool to
describe computer systems. Sometimes we can resort to mathematics,
sometimes a code snippet can be provided, but in most cases natural
language is the best we can hope for.

Moreover, there's no infrastructure, especially for things that don't
even exist yet. You may have a spec, but the system the spec describes
doesn't even exist - how are you going to build testing infrastructure
for it? How are you going to run your tests?

And if you're gonna say "we need a spec for those specs", well, think again. ;-)

Cheers,
mk


