[twill] Character Parsing
Terry Peppers
peppers at gmail.com
Wed Feb 22 18:00:18 PST 2006
I'm still working through the documentation on this one, but I found a
slightly odd case.
-----
#!/usr/bin/python
from twill import get_browser
from twill.commands import debug, go, follow, back, find, showforms
url = "http://swordstyle.com/test/index.php"
b = get_browser()
b.go(url)
showforms()
-----
This ends up throwing an unexpected error:
-----
Traceback (most recent call last):
  File "test_unicode.py", line 13, in ?
    showforms()
  File "/Library/Python/2.3/site-packages/twill-0.8.3-py2.3.egg/twill/commands.py", line 326, in showforms
    browser.showforms()
  File "/Library/Python/2.3/site-packages/twill-0.8.3-py2.3.egg/twill/browser.py", line 223, in showforms
    for n, f in enumerate(self._browser.forms()):
  File "/Library/Python/2.3/site-packages/twill-0.8.3-py2.3.egg/twill/other_packages/mechanize/_mechanize.py", line 244, in forms
    return self._factory.forms()
  File "/Library/Python/2.3/site-packages/twill-0.8.3-py2.3.egg/twill/utils.py", line 307, in forms
    self._forms = parse_fn(response, self._encoding)
  File "/Library/Python/2.3/site-packages/twill-0.8.3-py2.3.egg/twill/other_packages/mechanize/_html.py", line 218, in parse_response
    ignore_errors=self.ignore_errors
  File "/Library/Python/2.3/site-packages/twill-0.8.3-py2.3.egg/twill/other_packages/ClientForm.py", line 870, in ParseResponse
    encoding,
  File "/Library/Python/2.3/site-packages/twill-0.8.3-py2.3.egg/twill/other_packages/ClientForm.py", line 906, in ParseFile
    fp.feed(ch)
  File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/sgmllib.py", line 95, in feed
    self.goahead(0)
  File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/sgmllib.py", line 184, in goahead
    self.handle_entityref(name)
  File "/Library/Python/2.3/site-packages/twill-0.8.3-py2.3.egg/twill/other_packages/ClientForm.py", line 667, in handle_entityref
    self.handle_data(table[fullname].encode(self._encoding))
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u2122' in position 0: ordinal not in range(256)
-----
Has anyone else run into this problem?
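In case it helps narrow things down, the last frame seems to be reproducible without twill at all. The sketch below is only my guess at what is going on: I'm assuming the page contains an entity that maps to U+2122 (the trademark sign, presumably &trade;), and `encode_entity` is a made-up helper that just mirrors the `.encode(self._encoding)` call from the traceback. It is not ClientForm's actual code.

```python
# Sketch only: reproduces the failing encode step outside twill.
# Assumption: the page contains an entity that maps to U+2122 (trademark sign).
trademark = u"\u2122"


def encode_entity(text, encoding, errors="strict"):
    """Hypothetical helper mirroring the encode call in the last traceback frame."""
    return text.encode(encoding, errors)


# Latin-1 only covers code points 0-255, so a strict encode of U+2122 fails.
try:
    encode_entity(trademark, "latin-1")
    failed = False
except UnicodeEncodeError:
    failed = True

# UTF-8 can represent the character, so the same call succeeds.
utf8_bytes = encode_entity(trademark, "utf-8")

# A non-strict error handler avoids the exception. "replace" loses the
# character; "xmlcharrefreplace" keeps it as a numeric character reference.
replaced = encode_entity(trademark, "latin-1", "replace")
charref = encode_entity(trademark, "latin-1", "xmlcharrefreplace")
```

So the exception looks like it depends on the encoding the parser is handed (latin-1 here) rather than on anything specific to showforms(). Whether a different encoding or error handler is the right fix inside ClientForm is something I can't say.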