[twill] Form parsing error - using Python's own HTMLParser.py and not BeautifulSoup/tidy?

Hotsyk gotsyk at gmail.com
Wed Jan 28 12:01:21 PST 2009


 I had a similar problem with nose+twill, and traced it to incorrect
 parsing of the <br/> tag (without a space). I changed those tags to
 <br /> (with a space) and the error went away.

 I checked your page and found some <br/> tags there too. I'm not sure
 this is exactly your problem, but IMHO it is worth trying.
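
 A minimal sketch of that workaround, assuming you have the page's HTML
 as a string (for example, a saved copy you control). The function name
 add_space_to_br is hypothetical and is not part of twill:

```python
import re

# Matches <br/> with no whitespace before the slash, in any letter case.
_BR_NO_SPACE = re.compile(r"<br/>", re.IGNORECASE)


def add_space_to_br(html):
    """Return html with every <br/> rewritten as <br /> (hypothetical helper)."""
    return _BR_NO_SPACE.sub("<br />", html)


sample = "line one<br/>line two<br />line three"
print(add_space_to_br(sample))
```

 Tags that already contain the space are left untouched, so it is safe
 to run this over the same text more than once.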

 Volodymyr Hotsyk

P.S. Resending to make the context clear: this is a reply to the following message:

>Hello to all on the mailing list from a new member.
>I am trying to use twill to automate the use of my mobile phone operator's SMS
>web portal, to allow me to send text messages from the command line of my
>laptop, using its nice, big keyboard, rather than the tiny, fiddly keypad of my
>Using twill, I can successfully log in and follow the link to the SMS-sending
>page, but then twill crashes when it attempts to parse the forms on that page.
>When it crashes, the error seems to be in Python's own HTMLParser.py script.
>That puzzles me, because I have BeautifulSoup and tidy installed, and can prove
>(I think) that they are both being used by the fact that no exceptions are
>raised when commands are issued after requiring them in the config. If these
>superior HTML-parsing modules are being used, why is Python's HTML parser being
>called at all?
>twill has successfully parsed all other HTML pages (with forms) that I have
>thrown at it. There seems to be something particularly nasty about the HTML on
>this particular page (perhaps inserted deliberately by the mobile provider to
>prevent just this sort of automation). If twill simply can't handle it, then
>I'm happy to accept that. My concern is that there might be something wrong with
>my (pretty new) Python or twill installation, which is causing an avoidable
>exception to occur.
>Could anyone please suggest what is going wrong?
>(As the SMS-sending page is only accessible after logging in, for the purposes
>of illustration I have copied the HTML of that page and have saved it to a file
>on my own server. This copy still causes twill to crash in the same manner as
>when using the live version.)
>--- Start of text dump ---
>  -= Welcome to twill! =-
>current page:  *empty page*
>>> config require_tidy 1
>current page:  *empty page*
>>> config require_BeautifulSoup 1
>current page:  *empty page*
>>> config
>current configuration:
>        acknowledge_equiv_refresh : True
>        allow_parse_errors : True
>        readonly_controls_writeable : False
>        require_BeautifulSoup : True
>        require_tidy : True
>        use_BeautifulSoup : True
>        use_tidy : True
>        with_default_realm : False
>current page:  *empty page*
>>> go http://www.saytheword.org.uk/send-text-preparing.htm
>==> at http://www.saytheword.org.uk/send-text-preparing.htm
>current page: http://www.saytheword.org.uk/send-text-preparing.htm
>>> showlinks
>8>< - - - SNIP! I've cut this bit out to save space, but no exceptions
>are raised. - - - ><8
>>> showforms
>Traceback (most recent call last):
>  File "/usr/bin/twill-sh", line 8, in <module>
>    load_entry_point('twill==0.9', 'console_scripts', 'twill-sh')()
>  File "/usr/lib/python2.5/site-packages/twill-0.9-py2.5.egg/twill/shell.py",
>line 383, in main
>    shell.cmdloop(welcome_msg)
>  File "/usr/lib/python2.5/cmd.py", line 142, in cmdloop
>    stop = self.onecmd(line)
>  File "/usr/lib/python2.5/cmd.py", line 219, in onecmd
>    return func(arg)
>  File "/usr/lib/python2.5/site-packages/twill-0.9-py2.5.egg/twill/shell.py",
>line 42, in do_cmd
>    print '\nERROR: %s\n' % (str(e),)
>  File "/usr/lib/python2.5/HTMLParser.py", line 59, in __str__
>    result = self.msg
>AttributeError: 'ParseError' object has no attribute 'msg'
>--- End of text dump ---
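
Note that the traceback above ends inside __str__, not inside the parser:
the error handler in shell.py calls str(e), and that call itself raises
AttributeError because the exception object has no msg attribute, so the
original parse error message is never printed. A minimal sketch of that
failure mode and of a defensive way to print such an exception, written
for current Python rather than 2.5. BrokenParseError and describe are
hypothetical names, not twill or HTMLParser code:

```python
class BrokenParseError(Exception):
    """Hypothetical stand-in: __str__ relies on an attribute that was never set."""

    def __str__(self):
        return self.msg  # raises AttributeError when msg is missing


def describe(exc):
    """Return a printable description even if str(exc) itself fails."""
    try:
        return str(exc)
    except Exception as inner:
        # Fall back to the class name plus the reason str() failed.
        return "%s (str() failed: %r)" % (type(exc).__name__, inner)


try:
    raise BrokenParseError()
except BrokenParseError as e:
    print("ERROR: %s" % describe(e))
```

With a fallback like this, the handler still reports which exception type
was raised instead of crashing while trying to print it.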
