[TIP] Thoughts on Fuzz Testing

Fri Oct 23 00:35:26 PDT 2015

Hellos,

I have a few points to make, and a couple of questions:

   1. Complete correctness isn't needed for all types of software
   development.
   2. Runtime of tests, and supplying asynchronous results matter.
   3. Can fuzz testing use analysis of types to generate the preconditions?

1) Complete correctness of a solution is not always needed, especially when
prototyping, in artistic practice, and in many other types of software
development or experimentation. A running prototype can make errors in the
specification and requirements apparent more quickly than fully testing a
solution whose specification may itself be wrong.

Anything that gets in the way of iteration speed can be harmful to the
outcomes (for some types of software development). Automatically testing
your code is a really nice thing though, and I think can even help provide
better results in the cases I mentioned above.

For really low effort testing, I think Fuzz testing can be used
successfully: it suits development which needs to be very fast, but also
fairly correct. Defining more general invariants is better practice, and
says more about a program than a single example does (e.g. writing "all
additions are commutative" is better than writing "1 + 2 == 2 + 1"; it
says more).

Just like how some people say doc tests (IPython notebooks) are for lazy
developers, I see a similar misconception about Fuzz tests. Both provide
different types of value for different amounts of resources. One accusation
against writing doc tests is that they do not test many corner cases, and
if you do add them, the tests make the documentation unreadable. With Fuzz
testing (e.g. with Hypothesis) you can state your invariants in a small
amount of space and their intention is quite clear. (This is Python, we
should be optimizing for reading our code, right?!?)

2) The runtime of tests definitely impacts how the tests are used. If the
tests can run in under the time it takes to save a file, and write a commit
message then you won't impact the experience of the developer in any bad
ways. If you do it asynchronously, without distraction, and you don't block
up the development pipeline, then all the better.

By making the tests fast to run, and easy to use, Fuzz testing will become
more useful for these styles/stages of development. If it takes an extra
hour or two before test results are returned because fuzzing takes so long
to run, then keeping Fuzzing tests as a separate job can help. Don't bundle
the Fuzz tests up with your fast running unit tests. At one place we put
our functional tests that took over 30 minutes to run into a separate test
job. So then the quick running tests would return earlier (in less than a
minute after commit).

In reality developers are not going to wait 30+ minutes for their tests to
run before pushing code so they can collaborate. Running those tests in CI
though does catch errors. So run the slow tests on CI, and not by default
locally.
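
One way to keep the slow fuzz tests out of the default local run, sketched
here with only the standard library's unittest. The RUN_SLOW_TESTS
environment variable and the test names are made up for illustration; the
idea is that the CI job sets the variable and developers normally do not.

```python
import os
import random
import unittest

# Hypothetical switch: only the CI job would export RUN_SLOW_TESTS=1.
RUN_SLOW = os.environ.get("RUN_SLOW_TESTS") == "1"


class FastTests(unittest.TestCase):
    def test_addition_example(self):
        self.assertEqual(1 + 2, 2 + 1)


@unittest.skipUnless(RUN_SLOW, "slow fuzz tests run only when RUN_SLOW_TESTS=1")
class SlowFuzzTests(unittest.TestCase):
    def test_addition_commutes_for_many_inputs(self):
        rng = random.Random(0)
        for _ in range(10_000):
            x = rng.randint(-10**9, 10**9)
            y = rng.randint(-10**9, 10**9)
            self.assertEqual(x + y, y + x)


def run_suite():
    """Run both test classes and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite([
        loader.loadTestsFromTestCase(FastTests),
        loader.loadTestsFromTestCase(SlowFuzzTests),
    ])
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Locally the slow class is reported as skipped, so the quick feedback loop
stays quick; the CI job pays for the long run instead.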

I know some shops have commit hooks which run ALL tests before the commit
is allowed. In these places adding 30 minutes of tests won't make your work
mates happy!

Ok... question time...

3) Can static type analysis be used for automated fuzz testing?

(as collected by inference, or collected by traces when running unit tests
etc).

In this first example from hypothesis (
http://hypothesis.readthedocs.org/en/latest/quickstart.html ), although the
inverse operation invariant of encode(str)/decode(list) may not be deduced
with static type analysis, it could definitely find out that the inputs and
outputs are strings.

So for the first bug mentioned (where an empty string causes an exception
in encode(str)), automatically stuffing strings into the function would
have found the error.
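
A rough sketch of that idea, using only the standard library rather than
Hypothesis. The encode below is my own cut-down stand-in with the same
empty-string crash, and the GENERATORS table and fuzz_by_type_hints helper
are hypothetical names. It assumes the argument types are available as
annotations; types collected by inference or traces could feed the same
table.

```python
import inspect
import random
from typing import get_type_hints


def encode(input_string: str) -> list:
    """Run-length encoder with a deliberate bug: it crashes on ''."""
    count = 1
    prev = ""
    result = []
    for character in input_string:
        if character != prev:
            if prev:
                result.append((prev, count))
            count = 1
            prev = character
        else:
            count += 1
    # Bug: `character` was never assigned if input_string is empty.
    result.append((character, count))
    return result


# Hypothetical registry: annotated type -> random value generator.
GENERATORS = {
    str: lambda rng: "".join(rng.choice("WB") for _ in range(rng.randint(0, 5))),
    int: lambda rng: rng.randint(-100, 100),
}


def fuzz_by_type_hints(func, runs=200, seed=0):
    """Call func with random arguments picked from its annotations.

    Returns a list of (args, exception) for every call that raised.
    """
    hints = get_type_hints(func)
    params = list(inspect.signature(func).parameters)
    rng = random.Random(seed)
    failures = []
    for _ in range(runs):
        args = [GENERATORS[hints[name]](rng) for name in params]
        try:
            func(*args)
        except Exception as exc:
            failures.append((args, exc))
    return failures
```

Calling fuzz_by_type_hints(encode) needs no hand-written test at all, and
every failure it reports is the empty string.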

Likewise, static analysis could find that the input type for decode(list)
is [[str, int]]

It seems finding invariants in existing tests would also be possible by
inspecting existing assertions, and the types used in those assertions.

Say this assertion test already existed.
our_input = 'WBWBBBW'
encoded = encode(our_input)
assert our_input == decode(encoded)

It could try 'parametrising' the test with strings, because "our_input"
tells us the input is a string.
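
Roughly what the parametrised version of that assertion might look like,
again with plain random strings standing in for a real strategy. The
encode/decode pair here is my own minimal run-length implementation,
assumed correct for the sake of the example, and check_round_trip is a
hypothetical helper name.

```python
import random


def encode(text):
    """Run-length encode a string into [character, count] pairs."""
    result = []
    for character in text:
        if result and result[-1][0] == character:
            result[-1][1] += 1
        else:
            result.append([character, 1])
    return result


def decode(pairs):
    """Expand [character, count] pairs back into a string."""
    return "".join(character * count for character, count in pairs)


def check_round_trip(runs=500, seed=0):
    """Repeat the original assertion with generated strings as our_input."""
    rng = random.Random(seed)
    for _ in range(runs):
        length = rng.randint(0, 20)
        our_input = "".join(rng.choice("WB") for _ in range(length))
        encoded = encode(our_input)
        assert our_input == decode(encoded), our_input
    return runs
```

The body of the loop is the original three-line test unchanged; only the
source of our_input differs.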

Now if the inverse property was not a real invariant, and the test provides
a failing case, then you have also provided a result. The result is that
the invariant does not exist. This case would require feedback from the
programmer, as the test would be incorrect. But at least now the invariant
would be clear. We found an inconsistency in the specification... ya!

This commutative assertion on addition:

def test_ints_are_commutative():
    assert 1 + 2 == 2 + 1

could be transformed into this:

from hypothesis import given, strategies as st

@given(st.integers(), st.integers())
def test_ints_are_commutative(x, y):
    assert x + y == y + x

By finding the types in the original expression, and 'parametrising' them.
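
To make the transformation a bit more concrete, here is a toy sketch using
the standard library's ast module (ast.unparse needs Python 3.9 or later).
It only handles int literals, and LiteralsToParams, generalise and fuzz are
names I invented; I am not claiming any existing tool works this way.

```python
import ast
import random


class LiteralsToParams(ast.NodeTransformer):
    """Swap each distinct int literal for a generated parameter name."""

    def __init__(self):
        self.names = {}  # literal value -> parameter name

    def visit_Constant(self, node):
        # bool is a subclass of int, so leave True/False alone.
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            name = self.names.setdefault(node.value, f"arg{len(self.names)}")
            return ast.copy_location(ast.Name(id=name, ctx=ast.Load()), node)
        return node


def generalise(assert_source):
    """Turn 'assert <expr with int literals>' into a callable property."""
    statement = ast.parse(assert_source).body[0]
    if not isinstance(statement, ast.Assert):
        raise ValueError("expected a single assert statement")
    rewriter = LiteralsToParams()
    expr = ast.fix_missing_locations(rewriter.visit(statement.test))
    params = list(rewriter.names.values())
    code = compile(ast.Expression(body=expr), "<generalised>", "eval")

    def prop(*args):
        return eval(code, {}, dict(zip(params, args)))

    return params, ast.unparse(expr), prop


def fuzz(prop, arity, runs=200, seed=0):
    """Return the first counterexample found, or None if every run passed."""
    rng = random.Random(seed)
    for _ in range(runs):
        args = [rng.randint(-10**6, 10**6) for _ in range(arity)]
        if not prop(*args):
            return args
    return None
```

For example, generalise("assert 1 + 2 == 2 + 1") gives a two-parameter
property that survives fuzzing, while generalise("assert 4 - 2 == 2")
gives one that fails almost immediately, because that assertion was only
ever true for those particular numbers.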

But I guess in many cases there would be lots of false positives, and
perhaps this wouldn't be worth doing. But maybe it will be a case of the
developer just redefining the model until all the ambiguities are gone, and
all the assumptions are listed.

But perhaps there are some cases which can be completely automated with the
help of type analysis that don't give false positives? Or perhaps there are
Fuzzing tools which already do this?

cheerio,
