[TIP] branch coverage

Thu Jan 24 13:56:07 PST 2008

Andrew Dalke wrote:
> Hi all,
>
> m h sesquile at gmail.com:
>   
>>  I really don't want to write a python compiler, but
>> am assuming tracing at that level won't be supported in cpython, but
>> perhaps might in pypy....
>>     
>
>
> On Jan 24, 2008, at 12:32 PM, Laura Creighton wrote:
>   
>> I think that Andrew Dalke, who is cc'd to this note, has already done
>> a bunch of work which would be very useful to you.
>>     
>
> I was talking with Laura a few days ago about a lightning talk I  
> wanted to present at PyCon - branch coverage.  There's no easy way to  
> do it with Python.  The closest is to use the compiler module to  
> generate the AST then instrument the AST.  The problem is, the  
> compiler module is a bear to work with and doesn't record everything  
> I want.  For example, in the coverage report I want to see which  
> branches weren't covered, pinpointed to the character range of the  
> expression.  Python's AST doesn't have byte positions.
>
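For readers on current CPython: the standard `ast` module (Python 3.8 and later) records start and end positions on expression nodes, which is the kind of pinpointing described above. A minimal sketch, assuming a shortened form of the subprocess.py condition as sample input; `bool_op_spans` is a hypothetical helper name, not something from the original message.

```python
import ast

# Sample input: a shortened form of the condition quoted from subprocess.py.
SOURCE = (
    "if close_fds and (stdin is not None or stdout is not None):\n"
    "    pass\n"
)

def bool_op_spans(source):
    """Yield (operator name, line, start column, source text) per and/or node."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.BoolOp):
            yield (
                type(node.op).__name__,   # "And" or "Or"
                node.lineno,
                node.col_offset,          # UTF-8 byte offset within the line
                ast.get_source_segment(source, node),
            )

for span in bool_op_spans(SOURCE):
    print(span)
```

Each tuple identifies one `and`/`or` expression, so a coverage report could map a branch id back to the exact text it came from.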
> What I've been doing over the last couple of days is getting a PLY  
> grammar for Python.  As of last night it parses (and builds the  
> trivial concrete syntax tree of) the entire standard library.  After  
> the AST works I plan to convert code like this (from line 548 of  
> subprocess.py)
>
>   

A PLY grammar for Python is *extremely* interesting to me - for entirely
unrelated reasons. :-)

Michael

>      if close_fds and (stdin is not None or stdout is not None or
>                        stderr is not None):
>          raise ValueError("close_fds is not supported on Windows "
>                           "platforms if you redirect stdin/stdout/stderr")
>
> into something equivalent to this mess
>
>    __reached_statement(100)  # assuming this is statement number 100
>
>    if close_fds:
>      __branch_is_true(1)  # each branch also gets a unique id, with
>                           # some table mapping that to source file
>                           # and byte range
>
>      if stdin is not None:
>        __branch_is_true(2)
>        __result_bool = True
>        __result_obj = stdin
>      else:
>        __branch_is_false(2)
>
>        if stdout is not None:
>          __branch_is_true(3)
>          __result_bool = True
>          __result_obj = stdout
>        else:
>          __branch_is_false(3)
>
>          if stderr is not None:
>            __branch_is_true(4)
>            __result_bool = True
>            __result_obj = stderr
>          else:
>            __branch_is_false(4)
>            __result_bool = False
>            __result_obj = stderr
>    else:
>      __branch_is_false(1)
>      __result_bool = False
>      __result_obj = close_fds
>
>    if __result_bool:
>      __reached_statement(101)
>      raise ValueError("close_fds is not supported on Windows "
>                       "platforms if you redirect stdin/stdout/stderr")
>
> where __branch_is_true and __branch_is_false and __reached_statement  
> keep track of which branch points and lines were executed.  (Along  
> with a filename, module name, and md5 checksum to prevent version skew.)
>
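A minimal sketch of the bookkeeping behind those three functions, assuming plain in-memory sets keyed by id. The storage layout, the `uncovered` helper and the `instrumented_check` example are assumptions for illustration, not the design from the message; the filename and checksum table is left out.

```python
# Hypothetical storage: which ids were seen during the run.
statements_reached = set()
branches_true = set()
branches_false = set()

def __reached_statement(statement_id):
    statements_reached.add(statement_id)

def __branch_is_true(branch_id):
    branches_true.add(branch_id)

def __branch_is_false(branch_id):
    branches_false.add(branch_id)

def uncovered(branch_ids):
    """Return (never_true, never_false) for the given branch ids."""
    never_true = sorted(b for b in branch_ids if b not in branches_true)
    never_false = sorted(b for b in branch_ids if b not in branches_false)
    return never_true, never_false

def instrumented_check(close_fds, stdin):
    # Instrumented form of: close_fds and stdin is not None
    __reached_statement(100)
    if close_fds:
        __branch_is_true(1)
        if stdin is not None:
            __branch_is_true(2)
            return True
        __branch_is_false(2)
        return False
    __branch_is_false(1)
    return False
```

After a single call such as `instrumented_check(True, None)`, `uncovered([1, 2])` reports that branch 2 was never true and branch 1 was never false, which is the information a branch coverage report needs.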
>
> This horrible if statement expansion mess is needed because it's the  
> only way to keep Python guarantees:
>
>    -- short circuiting
>
>    -- the bool check is only done once per term
>
> Otherwise something like
>
>     if _bool_check(1, close_fds) and (_bool_check(2, stdin is not None) ... )
>
> would work, where
>
>    def _bool_check(branch_number, obj):
>      if obj:
>        __branch_is_true(branch_number)
>      else:
>        __branch_is_false(branch_number)
>      return obj
>
> This is simple, but it calls bool(obj) twice.
>
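The double truth test can be observed directly. A small sketch, assuming Python 3 (where the hook is `__bool__`; in 2008 it was `__nonzero__`); `CountingBool` and the `events` list are hypothetical names used only to make the effect visible.

```python
class CountingBool:
    """Test object that counts how often its truth value is checked."""

    def __init__(self, value):
        self.value = value
        self.checks = 0

    def __bool__(self):
        self.checks += 1
        return self.value

events = []  # stand-in for __branch_is_true / __branch_is_false

def _bool_check(branch_number, obj):
    if obj:                                   # first truth test
        events.append((branch_number, True))
    else:
        events.append((branch_number, False))
    return obj                                # the caller tests it again

flag = CountingBool(True)
if _bool_check(1, flag):                      # second truth test happens here
    pass

print(flag.checks)
```

For ordinary objects the extra check is harmless, but an object whose truth test is expensive or has side effects would behave differently under coverage than without it.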
>
>    -- the correct object is returned
>
> I had another hack which looked like
>
>      if (_about_to_call(1) and close_fds or is_false(1)) and  (...:
>
> along with some state tracking.  This handles the booleanness  
> correctly, but does not return the correct object during assignment
>
>    x = a and (b or c or d)
>
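To make the "correct object" requirement concrete: `and` and `or` return one of their operands, not a bool. The sketch below shows the difference, then one possible alternative that keeps all three guarantees by wrapping each term in a lambda. The `_or` helper and its `record` callback are hypothetical and are not the approach taken in the message; wrapping terms in lambdas also adds call overhead.

```python
a, b, c, d = "a-value", 0, [], "d-value"

x = a and (b or c or d)                              # an operand: "d-value"
y = bool(a) and (bool(b) or bool(c) or bool(d))      # only a bool: True

def _or(record, *terms):
    """Evaluate (branch_id, thunk) pairs with the semantics of `or`.

    Short-circuits, tests each term's truth once, and returns the
    deciding object (the first true term, else the last term).
    """
    result = None
    for branch_id, thunk in terms:
        result = thunk()
        truth = bool(result)          # the single truth test for this term
        record(branch_id, truth)
        if truth:
            return result
    return result

log = []
z = a and _or(
    lambda branch_id, truth: log.append((branch_id, truth)),
    (2, lambda: b),
    (3, lambda: c),
    (4, lambda: d),
)
```

Here `z` is the same object as `x`, and `log` records which terms were reached and how each one evaluated.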
>
> Once I have the modified AST, what should I do with it?
>
> I could generate raw Python code, which would be ugly, have the  
> comments stripped out, and line numbers changed.  Or I could generate  
> byte code.
>
> If the latter, I was thinking to write a .py -> .pyc compiler, but do  
> I use it like compileall?  Or do I generate the .pyc files in another  
> directory, which is used for the coverage testing?  Where do I keep  
> the coverage results?  Probably all in a single directory, named after  
> the Python module name.
>
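A minimal sketch of the "separate directory" option, assuming the standard `py_compile` module stands in for the instrumenting compiler (it compiles the source unchanged). `compile_to_dir` and the `coverage_build` directory name are hypothetical.

```python
import os
import py_compile
import tempfile

def compile_to_dir(source_path, output_dir):
    """Byte-compile source_path into output_dir and return the .pyc path."""
    os.makedirs(output_dir, exist_ok=True)
    module_name = os.path.splitext(os.path.basename(source_path))[0]
    target = os.path.join(output_dir, module_name + ".pyc")
    # doraise=True turns compile errors into py_compile.PyCompileError.
    return py_compile.compile(source_path, cfile=target, doraise=True)

# Usage: compile a throwaway module into a sibling build directory.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "sample.py")
    with open(source, "w") as handle:
        handle.write("x = 1\n")
    compiled = compile_to_dir(source, os.path.join(tmp, "coverage_build"))
    compiled_exists = os.path.exists(compiled)
```

Keeping the instrumented files in their own directory leaves the original tree untouched, and coverage results could be stored next to them under the same module name.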
> Do people only care about whether the branch was true/false, or is the  
> number of tests also important?  What about the number of times a  
> line was executed, vs. a flag saying that it was covered?
>
>
>
>
> 				Andrew
> 				dalke at dalkescientific.com