[twill] Character Parsing

Terry Peppers peppers at gmail.com
Wed Feb 22 18:00:18 PST 2006


I'm still working through the documentation on this one, but I found a
slightly odd case.

-----
#!/usr/bin/python

from twill import get_browser
from twill.commands import debug, go, follow, back, find, showforms

url = "http://swordstyle.com/test/index.php"

b = get_browser()

b.go(url)

showforms()
-----

This ends up raising an unexpected exception:

-----
Traceback (most recent call last):
  File "test_unicode.py", line 13, in ?
    showforms()
  File "/Library/Python/2.3/site-packages/twill-0.8.3-py2.3.egg/twill/commands.py", line 326, in showforms
    browser.showforms()
  File "/Library/Python/2.3/site-packages/twill-0.8.3-py2.3.egg/twill/browser.py", line 223, in showforms
    for n, f in enumerate(self._browser.forms()):
  File "/Library/Python/2.3/site-packages/twill-0.8.3-py2.3.egg/twill/other_packages/mechanize/_mechanize.py", line 244, in forms
    return self._factory.forms()
  File "/Library/Python/2.3/site-packages/twill-0.8.3-py2.3.egg/twill/utils.py", line 307, in forms
    self._forms = parse_fn(response, self._encoding)
  File "/Library/Python/2.3/site-packages/twill-0.8.3-py2.3.egg/twill/other_packages/mechanize/_html.py", line 218, in parse_response
    ignore_errors=self.ignore_errors
  File "/Library/Python/2.3/site-packages/twill-0.8.3-py2.3.egg/twill/other_packages/ClientForm.py", line 870, in ParseResponse
    encoding,
  File "/Library/Python/2.3/site-packages/twill-0.8.3-py2.3.egg/twill/other_packages/ClientForm.py", line 906, in ParseFile
    fp.feed(ch)
  File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/sgmllib.py", line 95, in feed
    self.goahead(0)
  File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/sgmllib.py", line 184, in goahead
    self.handle_entityref(name)
  File "/Library/Python/2.3/site-packages/twill-0.8.3-py2.3.egg/twill/other_packages/ClientForm.py", line 667, in handle_entityref
    self.handle_data(table[fullname].encode(self._encoding))
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u2122' in position 0: ordinal not in range(256)
-----

Has anyone else run into this problem?
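
For what it's worth, the last frame of the traceback suggests the page contains an entity that resolves to U+2122 (the trademark sign, presumably written as &trade;), and that character has no latin-1 representation. The sketch below reproduces just that encoding step outside of twill. It is written for Python 3 rather than the Python 2.3 used above, and encode_entity is a hypothetical helper for illustration, not part of twill, mechanize, or ClientForm.

```python
# Python 3 sketch of the failing step; the original report used Python 2.3.
from html.entities import name2codepoint


def encode_entity(name, encoding, errors="strict"):
    """Resolve a named HTML entity and encode it.

    Hypothetical helper for illustration; not twill's or ClientForm's API.
    """
    char = chr(name2codepoint[name])
    return char.encode(encoding, errors)


# "trade" maps to U+2122, which latin-1 cannot represent,
# so strict encoding raises the same kind of error as in the traceback.
try:
    encode_entity("trade", "latin-1")
except UnicodeEncodeError as exc:
    print("strict encoding fails:", exc)

# Falling back to a numeric character reference avoids the exception.
print(encode_entity("trade", "latin-1", errors="xmlcharrefreplace"))

# An encoding that covers the character, such as UTF-8, also works.
print(encode_entity("trade", "utf-8"))
```

Whether the right fix is a more forgiving error handler or a different page encoding depends on how the parser is meant to treat entities outside the declared charset, which the traceback alone does not settle.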