[twill] general query re form parsing.

John J Lee jjl at pobox.com
Tue Jan 24 11:40:47 PST 2006


On Tue, 24 Jan 2006, Titus Brown wrote:
[...]
> I could stick with the mechanize/ClientForm approach, which is to deal
> badly with outright errors that exist in the semantics of the page.
> ('tidy' will not fix this!)

That wasn't an official 'approach'; I just didn't (until recently) 
get around to doing anything about it.

It's probably a reasonable approach for functional testing for many 
people, though: generally one is happy to be told about one's own HTML 
bugs.


> I could switch to using BeautifulSoup if BS is installed, as well as
> continuing to do tidy preprocessing if tidy is installed.

There's some BeautifulSoup support in ClientForm and mechanize now.  It's 
not well tested yet, and SVN HEAD is a little unstable at the moment.


> I could also modify ClientForm to be tolerant to ParseErrors of the sort
> that Peri encounters.
>
> Right now I'm leaning towards including BS and modifying ClientForm.

If you write something and people use it in twill, I can probably merge it 
back into ClientForm as a new parser class.  I was nervous about removing 
the ParseError raises (or doing similar work) myself: I wasn't really 
using that kind of tolerant parsing, so I was fairly sure I would break it.  
(I certainly made a mess of it when I implemented the ignore errors arg, 
which is why I removed it again.)  If twill users exercise the modified 
code, then that's no longer a problem :-)
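
For concreteness, here is a minimal sketch of the kind of tolerant parser 
class discussed above.  It is not ClientForm's code or API: it assumes 
Python's standard html.parser module, and the names TolerantFormParser and 
parse_forms are made up for illustration.  The idea is simply to record a 
warning and carry on at the points where a strict parser might raise.

```python
from html.parser import HTMLParser


class TolerantFormParser(HTMLParser):
    """Collect forms and their controls, recording problems instead of raising.

    Hypothetical illustration only; not part of ClientForm or mechanize.
    """

    def __init__(self):
        super().__init__()
        self.forms = []      # each form: {"attrs": {...}, "controls": [...]}
        self.warnings = []   # human-readable notes about malformed markup
        self._current = None # the form currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            if self._current is not None:
                # A strict parser might raise here; treat the old form as closed.
                self.warnings.append("nested <form>; closing the previous one")
            self._current = {"attrs": attrs, "controls": []}
            self.forms.append(self._current)
        elif tag in ("input", "select", "textarea"):
            if self._current is None:
                self.warnings.append("<%s> outside any <form>; ignored" % tag)
            else:
                self._current["controls"].append((tag, attrs.get("name")))

    def handle_endtag(self, tag):
        if tag == "form":
            if self._current is None:
                self.warnings.append("stray </form>; ignored")
            self._current = None


def parse_forms(html):
    """Return (forms, warnings) for a possibly malformed HTML string."""
    parser = TolerantFormParser()
    parser.feed(html)
    parser.close()
    return parser.forms, parser.warnings
```

Returning the warnings alongside the forms keeps the "tell me about my own 
HTML bugs" behaviour available for functional testing, while still letting 
a script get on with a page it doesn't control.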


John


