[TIP] Functional Testing of Desktop Applications
fuzzyman at voidspace.org.uk
Mon Mar 5 14:51:42 PST 2007
Nate Lowrie wrote:
>> Well, our project is a lot smaller than PyPy, but still around 20k
>> production lines and growing.
>> We have 2300 tests (unit and functional), and they take one and a half
>> hours to run. Over 2/3 of this time is functional tests.
> Wow... That is a lot of time for only 2300 tests. Then again,
> functional tests might be the cause of that. It sounds to me like you
> have several Slow Test smells in your code. I would look at your
> tests, identify the ones taking the longest, and refactor or mock them
> to speed it up.
The time is between 2/3 and 3/4 on the functional tests. (And I
exaggerated slightly - the current build time is about an hour and ten
minutes.)
We don't mock as much as we could, so we could probably improve the
speed of our unit tests.
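As a rough illustration of finding the slowest tests before deciding what to mock, here is a minimal sketch using the standard unittest module in present-day Python. The names TimingResult, slowest_tests and ExampleTests are hypothetical and exist only for this example; it is not the tooling used on the project.

```python
import time
import unittest


class TimingResult(unittest.TestResult):
    """Record the wall-clock duration of every test that runs."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.timings = {}    # test id -> elapsed seconds
        self._started = {}   # test id -> start timestamp

    def startTest(self, test):
        super().startTest(test)
        self._started[test.id()] = time.perf_counter()

    def stopTest(self, test):
        elapsed = time.perf_counter() - self._started.pop(test.id())
        self.timings[test.id()] = elapsed
        super().stopTest(test)


def slowest_tests(suite, count=10):
    """Run the suite and return the `count` slowest (test id, seconds) pairs."""
    result = TimingResult()
    suite.run(result)
    ranked = sorted(result.timings.items(), key=lambda item: item[1], reverse=True)
    return ranked[:count]


class ExampleTests(unittest.TestCase):
    # Hypothetical tests, present only to make the sketch runnable.
    def test_fast(self):
        self.assertEqual(1 + 1, 2)

    def test_slow(self):
        time.sleep(0.05)


if __name__ == "__main__":
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    for test_id, seconds in slowest_tests(suite):
        print(f"{seconds:8.3f}s  {test_id}")
```

The timings include setUp and tearDown, which is usually what you want when hunting for slow tests.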
For the functional tests we don't mock *anything* (well... except the
blocking modal system dialogs). A lot of these tests hit the filesystem,
the network and databases. We want our functional tests to test *actual*
behaviour, not behaviour in the presence of mocks.
Each functional test launches the application, so we have some startup
time overhead per test.
Some of these tests involve checking behaviour when the network is
disconnected, attempting to connect to non-existent servers etc - so
there are delays waiting for timeouts.
We also have functional performance tests which run processing runs ten
times and ensure there aren't too many outlying times beyond the expected.
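A check of this kind might look something like the sketch below. The 25% tolerance, the limit of two outliers and all of the function names are assumptions made for illustration; the real thresholds are not given here.

```python
import time


def time_runs(operation, runs=10):
    """Call `operation` `runs` times and return each wall-clock duration in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        timings.append(time.perf_counter() - start)
    return timings


def count_outliers(timings, expected_seconds, tolerance=0.25):
    """Count timings that exceed the expected time by more than `tolerance` (a fraction)."""
    limit = expected_seconds * (1 + tolerance)
    return sum(1 for seconds in timings if seconds > limit)


def assert_performance(operation, expected_seconds, runs=10,
                       tolerance=0.25, max_outliers=2):
    """Fail if too many of the timed runs are slower than expected."""
    timings = time_runs(operation, runs)
    outliers = count_outliers(timings, expected_seconds, tolerance)
    assert outliers <= max_outliers, (
        f"{outliers} of {runs} runs exceeded "
        f"{expected_seconds * (1 + tolerance):.3f}s"
    )
    return timings
```

Allowing a small number of outliers rather than asserting on every run keeps a single slow run (a disk flush, a busy build machine) from failing the build.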
All these take time. As I said, we're now moving to only running
unit tests before checking in.
> I was working on a C# project not long ago and we had over 6000 unit
> tests that were compiled and run every 45 minutes by CruiseControl.
> Yes I know it's not the same as python but with python there is also
> no need to take 5 minutes to compile.
We have a continuous integration server that runs our full test suite
continuously, checking out the latest version from subversion. We
started with CruiseControl.NET but:
* It breaks if your test failure messages output XML !! (breaks their
* It breaks if you pull down and restore the network like our functional
tests do.
We eventually replicated all the features of CC.NET that we were
actually using in 120 lines of CPython code. (Which is why I'm reluctant
to consider switching to buildbot - we have no motivation.)
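To give a feel for how little is needed, here is a minimal sketch of such a loop in present-day Python. It is not the actual script: the function names, the `report` callback and the 60 second pause are all hypothetical, and it assumes the `svn` command-line client is on the path.

```python
import subprocess
import time


def run_command(command, cwd):
    """Run a command and return (exit code, combined stdout and stderr)."""
    completed = subprocess.run(
        command,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return completed.returncode, completed.stdout


def build_once(working_copy, test_command, run=run_command):
    """Update the working copy, run the tests, and return (passed, log)."""
    code, update_log = run(["svn", "update"], working_copy)
    if code != 0:
        # Could not update, so report that instead of running stale tests.
        return False, update_log
    code, test_log = run(test_command, working_copy)
    return code == 0, update_log + test_log


def build_forever(working_copy, test_command, report, pause_seconds=60):
    """Build repeatedly, passing each (passed, log) result to `report`."""
    while True:
        passed, log = build_once(working_copy, test_command)
        report(passed, log)
        time.sleep(pause_seconds)
```

Because the log is treated as plain text and never parsed, failure messages that happen to contain XML cannot break the reporting.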
>> We are just switching over to say that we only need to run unit tests
>> before checking in.
> If you run a layered architecture you should be able to get away with
> only running unit tests for that layer before checking in. Then the
> Continuous Integration server can run the suite to check the rest.
> Your only worry in this is that the functional tests that go cross
> layer might fail.
At the moment it is not too onerous to run our full unit test suite.
It still leaves us with problems if we have to revert a checkin which
other developer pairs have already merged into their ongoing work.
>> What I'm saying certainly seems to be 'very extreme' for most Pythoneers
>> that I've talked to (and those on the list here) - but it really is
>> pretty standard XP, and the sort of stuff being explored by the agile
>> crowd. I have to say that it works *very* well for us, and provides a
>> great environment to develop in.
> I would agree. My biggest argument for blanket mock coverage is the
> retesting of functions and methods goes away. Suppose you have
> function A which calls function B to perform X functionality. If you
> want to test that function A can do X, you either give it input that
> leads it into and out of B and check states or you mock. Since B
> already has unit tests this is a waste of time, especially if B calls
> C and C calls D etc.... If you mock B and assume that the
> functionality will be verified by its unit tests, you are not
> testing code twice and there is a speed up of the process (varies).
> It may be extreme, but I think that it avoids duplicated test smells
> and is necessary to also avoid Slow Test smells.
Well, I agree with you.
It may well be extreme, but it is fairly mainstream eXtreme... :-)
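The "mock B when testing A" idea discussed above can be sketched with the standard library's unittest.mock in present-day Python. function_a and function_b are hypothetical stand-ins for the A and B in the discussion, not code from either project.

```python
from unittest import mock


def function_b(value):
    """Stand-in collaborator, assumed to be covered by its own unit tests."""
    return value * 2


def function_a(value):
    """Calls function_b and adds one to its result."""
    return function_b(value) + 1


def test_function_a_uses_b():
    fake_b = mock.Mock(return_value=10)
    # patch.dict on globals() keeps this sketch in a single file; the
    # replacement is undone automatically when the with block exits.
    with mock.patch.dict(globals(), function_b=fake_b):
        # Only A's own logic (the "+ 1") is exercised here.
        assert function_a(3) == 11
    # A must have delegated to B exactly once, with the original argument.
    fake_b.assert_called_once_with(3)
```

The test checks how A talks to B and what A does with the answer, and leaves B's behaviour to B's own tests.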