[twill] Working around <br/> issue?
Misha Koshelev
misha680 at gmail.com
Wed Dec 30 15:00:13 PST 2009
Thank you. Very helpful!!! I have used the following code:
go http://www.google.com/voice#trash
fv 1 Email email
fv 1 Passwd passwd
submit
run "b=get_browser();b.result.page=b.result.page.replace('<br/>','<br />');"
The replacement works, as I can confirm with save_html. However, showforms still fails:
>> showforms
Traceback (most recent call last):
  File "/usr/bin/twill-sh", line 20, in <module>
    twill.shell.main()
  File "/var/lib/python-support/python2.5/twill/shell.py", line 387, in main
    shell.cmdloop(welcome_msg)
  File "/usr/lib/python2.5/cmd.py", line 142, in cmdloop
    stop = self.onecmd(line)
  File "/usr/lib/python2.5/cmd.py", line 219, in onecmd
    return func(arg)
  File "/var/lib/python-support/python2.5/twill/shell.py", line 42, in do_cmd
    print '\nERROR: %s\n' % (str(e),)
  File "/usr/lib/python2.5/HTMLParser.py", line 59, in __str__
    result = self.msg
AttributeError: 'ParseError' object has no attribute 'msg'
Any idea how I can figure out what's causing this? I'd like to avoid sending the HTML to a public
list, but will gladly send it by private email.
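
The traceback shows the failure happening inside __str__ while the shell formats the error, so the original parse error is hidden. One possible way to see it is to avoid relying on str() alone. The following is a minimal sketch in plain Python that assumes nothing about twill's internals; describe_exception and BrokenError are hypothetical names used only for illustration:

```python
def describe_exception(exc):
    """Describe exc even if its own __str__ raises."""
    try:
        return "%s: %s" % (type(exc).__name__, str(exc))
    except Exception as inner:
        # str(exc) itself failed, so fall back to the type name and raw args.
        return "%s (str() failed with %s: %s); args=%r" % (
            type(exc).__name__, type(inner).__name__, inner, exc.args)


class BrokenError(Exception):
    # Hypothetical stand-in for an exception whose __str__ reads an
    # attribute that was never set, as in the traceback above.
    def __str__(self):
        return self.msg


if __name__ == "__main__":
    print(describe_exception(BrokenError("bad tag", 12)))
    print(describe_exception(ValueError("plain")))
```

The fallback prints the exception's args, which is where the parser's original message would normally end up.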
Thank you
Misha
Howard B. Golden wrote:
> On Wednesday December 30, 2009, Misha Koshelev wrote:
>
>> I obviously am not able to alter this website. However, I was
>> wondering is there either: (a) a way to query-replace all <br/> with
>> <br /> _before_ twill parses a website or, alternately,
>> (b) I can do this from a shell script fairly easily. Is there a
>> mechanism, analogous to save_html, to _input_ an HTML file into
>> twill, say like load_html?
>
> If you trace how twill's commands work, you will see that they use a
> browser.get_html() call to get the page's data. This is then used for
> all the rest of the processing. For example, look at the code for find()
> in commands.py.
>
> The browser object also has a "result" attribute. This is a
> ResultWrapper object (see utils.py) created by _journey() in browser.py
> when it reads a page. Inside this object, there is a "page" attribute,
> which you can play with (and modify).
>
> In summary, you can create a browser object (call it "b"). Then, after
> you do a "b.go(url)", you can modify "b.result.page" to be whatever you
> want before calling other functions.
>
> Note: All of this can be done using a "run" command. See my previous
> message (http://lists.idyll.org/pipermail/twill/2009-March/000962.html)
> for how this works.
>
> (I haven't tested this, so it may have some syntax errors. Let me know
> if you have any questions.)
>
> Howard
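
The page rewrite described above can be sketched as a small standalone function. A literal replace only catches the exact string '<br/>', so this version uses a regular expression that also covers variants such as '<BR/>'. normalize_br is a hypothetical helper name, not part of twill, and assigning its result to b.result.page follows the suggestion above and is untested here:

```python
import re

# Matches <br/>, <BR/>, <br   /> and similar self-closing forms.
# Assumption: only these forms need rewriting to the '<br />' spelling.
_BR_TAG = re.compile(r"<br\s*/>", re.IGNORECASE)


def normalize_br(page):
    """Return page with every self-closing br tag rewritten as '<br />'."""
    return _BR_TAG.sub("<br />", page)


if __name__ == "__main__":
    sample = "line one<br/>line two<BR/>line three<br />done"
    print(normalize_br(sample))
```

Inside a twill "run" command, the idea would be something like b.result.page = normalize_br(b.result.page), with the function defined beforehand.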