[twill] html entities and latin-1 problem
titus at caltech.edu
Sun Mar 5 14:29:09 PST 2006
On Sun, Mar 05, 2006 at 11:04:57PM +0100, gabor wrote:
-> i'm using twill-0.8.3, and it works fine, except for the following problem:
-> 1. have the following html file : "›" (yes, that entity is the
-> whole file)
-> 2. start twill-sh, go to that page, and do a "showforms"
-> 3. you get the following error:
-> UnicodeEncodeError: 'latin-1' codec can't encode character u'\u203a' in
-> position 0: ordinal not in range(256)
-> (full stacktrace at the end of the mail)
-> it's logical that that character is un-encode-able in latin-1, and
-> that's fine. but why is it trying to represent it in latin-1?
-> as a quick-fix, changing "latin-1" to "utf-8" in
-> twill/other_packages/mechanize/_html.py/form_parser_args helps,
-> but i don't think that's the cleanest solution..
-> any better ideas?
Short answer -- unicode support in mechanize is still young ;(.
I have one or two other unicode issues to look at today, too.
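
For reference, here is a minimal sketch of why the encode step fails and why switching the codec to utf-8 makes the error go away. It is written for present-day Python 3 rather than the Python 2 that twill-0.8.3 ran on, and it does not touch mechanize at all; the helper name try_encode is made up for illustration.

```python
# The character named in the traceback: U+203A.
char = "\u203a"


def try_encode(text, encoding):
    """Hypothetical helper: return the encoded bytes, or None on failure."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        # latin-1 only covers code points 0..255, so U+203A cannot be encoded.
        return None


latin1_result = try_encode(char, "latin-1")  # fails, so this is None
utf8_result = try_encode(char, "utf-8")      # succeeds as three bytes

# One alternative to changing the codec: keep latin-1 but fall back to a
# numeric character reference for anything outside its range.
xmlref = char.encode("latin-1", errors="xmlcharrefreplace")

print(latin1_result, utf8_result, xmlref)
```

Whether a fallback error handler or a different codec is the right fix inside mechanize depends on what the form parser does with the bytes afterwards, which this sketch does not cover.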