[TIP] why you should distribute tests with your application / module

Wed Sep 17 19:04:43 PDT 2008

On Sep 17, 2008, at 3:32 PM, Jesse Noller wrote:
>>> alternatively, you could skip the tests that require the large
>>> datasets when the datasets are not present.  This way the end user can
>>> still run unit tests to determine to some degree whether or not the
>>> installed binary of your package is "working."  In other words, some
>>> tests are better than no tests.
>>
>> Yeah, this is pretty much what I had in mind / suggested earlier.
>> Seems like this would make a decent nose plugin
>> (--download-test-data)... not that I'm likely to write it. ;-)
>>
>> Thanks all.
>>
>> --Pete
>
> I/we could hack it out exceedingly quickly - what do you see it doing,
> accepting a URL and unzipping it? Do you want it to be a command
> line argument (i.e., the URL) or an artifact in the nose config file?

Ideal solution:

The plugin provides an external_data(url, unpack=True) function, which
returns a path on the local disk where the test data lives.  The
plugin adds a --download-external-data option to control whether the
data is retrieved or not.  If the data isn't available and that option
is not specified, external_data raises SkipTest.
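
A rough sketch of how such a function might behave, using only the
stdlib.  Everything here is an assumption made for illustration, not an
existing nose API: the DATA_DIR and DOWNLOAD_ENABLED settings stand in
for the .noserc directory and the --download-external-data option, the
local_path_for helper is hypothetical, unittest.SkipTest stands in for
nose's SkipTest, and the unpack argument is left out to keep it short.

```python
import hashlib
import os
import posixpath
import unittest
import urllib.request
from urllib.parse import urlparse

# Hypothetical settings: a real plugin would fill these in from the
# --download-external-data option and the user's .noserc.
DATA_DIR = os.path.expanduser("~/.nose-external-data")
DOWNLOAD_ENABLED = False


def local_path_for(url, data_dir=None):
    """Map a URL to a stable file name inside the data directory."""
    data_dir = data_dir or DATA_DIR
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:12]
    base = posixpath.basename(urlparse(url).path) or "download"
    return os.path.join(data_dir, digest + "-" + base)


def external_data(url, download=None, data_dir=None):
    """Return the local path of the data for url, or raise SkipTest."""
    if download is None:
        download = DOWNLOAD_ENABLED
    path = local_path_for(url, data_dir)
    if os.path.exists(path):
        # Already fetched on an earlier run.
        return path
    if not download:
        raise unittest.SkipTest("external data not available: %s" % url)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    urllib.request.urlretrieve(url, path)
    return path
```

A test would call external_data(...) first thing; when the data is
missing and downloading is off, the SkipTest propagates and the test is
reported as skipped rather than failed.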

The user would need to specify a directory for storing the external
data in, probably in their .noserc.  Using httplib2 for the downloading
might make the file management trivial, as it supports on-disk caching
already.  Otherwise, you'd need to track which files correspond to which
URLs.  Not hard, but if httplib2 can do it for you, it's laziness ftw.
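
If httplib2 isn't used, the URL-to-file bookkeeping could be as small as
a JSON index kept in the data directory.  The sketch below is one
possible shape for it; the UrlIndex class, its method names and the
index.json file name are all made up for illustration.

```python
import json
import os
import posixpath
from urllib.parse import urlparse


class UrlIndex:
    """Remember which local file holds the data fetched from each URL."""

    def __init__(self, data_dir):
        self.data_dir = data_dir
        self.index_path = os.path.join(data_dir, "index.json")
        self.entries = {}
        if os.path.exists(self.index_path):
            with open(self.index_path) as f:
                self.entries = json.load(f)

    def lookup(self, url):
        """Return the recorded path if the file still exists, else None."""
        name = self.entries.get(url)
        if name is None:
            return None
        path = os.path.join(self.data_dir, name)
        return path if os.path.exists(path) else None

    def record(self, url):
        """Reserve a file name for url, save the index, return the path."""
        if url not in self.entries:
            base = posixpath.basename(urlparse(url).path) or "download"
            # The counter prefix keeps names unique when two URLs
            # happen to end in the same file name.
            self.entries[url] = "%04d-%s" % (len(self.entries), base)
            os.makedirs(self.data_dir, exist_ok=True)
            with open(self.index_path, "w") as f:
                json.dump(self.entries, f, indent=2)
        return os.path.join(self.data_dir, self.entries[url])
```

The downloader would call record(url) to get a destination path before
fetching, and lookup(url) on later runs to find out whether the data is
already on disk.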

Names & functionality subject to change.  Unpacking should probably  
support .gz, .tar.gz & .zip (all pretty straightforward w/ stdlib).
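
For a sense of how little code the unpacking needs, here is a sketch
built on the stdlib gzip, tarfile and zipfile modules.  The unpack name
and the choice to dispatch on the file extension are assumptions, not a
settled design.

```python
import gzip
import os
import shutil
import tarfile
import zipfile


def unpack(archive_path, dest_dir):
    """Unpack a .tar.gz, .zip or .gz file into dest_dir; return dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    # Check .tar.gz before .gz, since both end in ".gz".
    if archive_path.endswith((".tar.gz", ".tgz")):
        # Newer Pythons offer extraction filters that reject unsafe
        # members; use the "data" filter when it is available.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        with tarfile.open(archive_path, "r:gz") as tar:
            tar.extractall(dest_dir, **kwargs)
    elif archive_path.endswith(".zip"):
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(dest_dir)
    elif archive_path.endswith(".gz"):
        # A bare .gz holds a single file: strip the suffix for its name.
        target = os.path.join(dest_dir, os.path.basename(archive_path)[:-3])
        with gzip.open(archive_path, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
    else:
        raise ValueError("unsupported archive type: %s" % archive_path)
    return dest_dir
```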

I originally thought of less ambitious solutions - just specifying a  
list of URLs in the config file/command line, say.  But doing so  
doesn't seem to add much beyond including a simple downloading script  
in the package itself.

Thanks!  Lemme know how I can help.

--Pete