[TIP] Nose and doctests in extension modules...

Thu Jun 19 17:04:02 PDT 2008

Howdy,

On Thu, Jun 19, 2008 at 2:30 PM, Kumar McMillan
<kumar.mcmillan at gmail.com> wrote:
> On Thu, Jun 19, 2008 at 3:09 PM, Fernando Perez <fperez.net at gmail.com> wrote:
>> Hi folks,
>>
>> I have a question that much googling didn't provide answers for.  Feel
>> free to point me towards TFM if I missed something.
>>
>> My question is: can nose find/use doctests embedded in docstrings as
>> part of extension modules?  If it can, what am I doing wrong? If not,
>> could it be done?
>
> What you posted below is the behavior I'd expect.  You told nose to
> look in .so files, which are binary, so nose is trying to load
> primes.so and send it through doctest, which is failing because it's
> not text.  I'd suggest instead trying --doctest-extension=pyx so that
> it loads your Cython/Pyrex files into doctest.  See if that works.

It kind of works, though with 2 caveats:

1. I had to add at the beginning of the docstring this:

    >>> from primes import primes

Since a .pyx file is not itself executable, I had to add the import
statement by hand.

2. For some reason I don't understand, the original run gave me this error:

Failed example:
    primes(10)
Expected:
    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
    """
Got:
    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

I don't understand why it failed to identify the end of the docstring
and grabbed the """ as part of the doctest.  I was able to work around
this by adding a blank line at the end of the docstring, but this
shouldn't be necessary, and I suspect that something is not being done
right in docstring identification.
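
One possible explanation, sketched here with the stdlib doctest parser
(written for a modern Python 3): if the .pyx file is read as a plain
text file rather than as python source, which is only my assumption
about what --doctest-extension does, then the closing """ is just
another non-blank line after the expected output, and doctest keeps
reading expected output until it hits a blank line or the next prompt.
The helper name expected_output is made up for this illustration.

```python
import doctest

parser = doctest.DocTestParser()

# The tail of the docstring as it appears in the raw file text, with the
# closing triple quote directly after the expected output.
without_blank = (
    '    >>> primes(10)\n'
    '    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]\n'
    '    """\n'
)

# Same text, with a blank line before the closing triple quote.
with_blank = (
    '    >>> primes(10)\n'
    '    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]\n'
    '\n'
    '    """\n'
)


def expected_output(text):
    """Return the expected output doctest parses for the first example."""
    return parser.get_examples(text)[0].want


# Without the blank line, the quotes are swallowed into the expectation.
print(repr(expected_output(without_blank)))
# With the blank line, the expectation stops at the list.
print(repr(expected_output(with_blank)))
```

If that is what is happening, it would also explain why the blank line
works around it: the parser is never told where the docstring ends.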

But I'm still not happy with this solution, because the docstrings
_should_ be fetched from the .so file.  It doesn't matter that it's
binary: all functions defined in it can be introspected using the
python API for objects:

>>> import primes

>>> print primes.primes.__doc__
Return a list with the first kmax primes.

    Examples:

    >>> from primes import primes

    >>> primes(1)
    [2]

    >>> primes(10)
    [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

Docstrings can be extracted in python from objects as their __doc__
attribute, even if said objects live inside a binary extension module.
In fact, the pyx file is NOT where they should be searched for, because
that file is not really valid python (it's pyrex) and I suspect it
will trip the parser in a more complicated example than what I posted.
My example happened to be so trivial that parsing it worked OK (the
pyrex source was almost identical to python), but real-world pyrex
files contain lots of non-python in them (that's the whole point).

So what I'd like to know is if nose can be taught (or if it knows
already and I'm doing something wrong) to correctly introspect objects
that live in extension modules for their docstrings, so it can fetch
doctests from either python sources or extension modules.

I really appreciate your help, I just think that the current situation
is still not satisfactory.  For context, all of this concerns NumPy
and SciPy: there's lots of interest in those projects in using more
doctests, and we recently switched to using nose for all numpy/scipy
testing.  But numpy/scipy have gobs of extension code, and adequate
support for nose finding doctests in extension code is really a huge
issue for us in the long run.

Thanks for any help,

f