[twill] Working around <br/> issue?

Wed Dec 30 15:00:13 PST 2009

Thank you, that was very helpful! I used the following script:

go http://www.google.com/voice#trash
fv 1 Email email
fv 1 Passwd passwd
submit
run "b=get_browser();b.result.page=b.result.page.replace('<br/>','<br />');"

The replacement works, as I can confirm with save_html. However, when I run showforms I still get:
>> showforms
Traceback (most recent call last):
  File "/usr/bin/twill-sh", line 20, in <module>
    twill.shell.main()
  File "/var/lib/python-support/python2.5/twill/shell.py", line 387, in main
    shell.cmdloop(welcome_msg)
  File "/usr/lib/python2.5/cmd.py", line 142, in cmdloop
    stop = self.onecmd(line)
  File "/usr/lib/python2.5/cmd.py", line 219, in onecmd
    return func(arg)
  File "/var/lib/python-support/python2.5/twill/shell.py", line 42, in do_cmd
    print '\nERROR: %s\n' % (str(e),)
  File "/usr/lib/python2.5/HTMLParser.py", line 59, in __str__
    result = self.msg
AttributeError: 'ParseError' object has no attribute 'msg'
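
The last frames of the traceback show str(e) itself failing inside HTMLParser.py's __str__, so the AttributeError is raised while printing the original ParseError and hides it. A minimal sketch of how such an exception might be inspected without calling str() on it follows. StandInParseError and describe_exception are hypothetical names used only for illustration; they are not part of twill or HTMLParser. The sketch is written for a current Python; on Python 2.5 the except clause would be spelled "except StandInParseError, e".

```python
class StandInParseError(Exception):
    """Hypothetical stand-in: __str__ reads an attribute that was never set."""

    def __str__(self):
        return self.msg  # raises AttributeError, as in the traceback above


def describe_exception(exc):
    """Return a description of exc without relying on str(exc)."""
    parts = [type(exc).__name__]
    # args and __dict__ are plain attribute reads, so they bypass __str__.
    parts.append("args=%r" % (getattr(exc, "args", ()),))
    parts.append("attrs=%r" % (getattr(exc, "__dict__", {}),))
    return " ".join(parts)


try:
    # Hypothetical message and position, only to have something to show.
    raise StandInParseError("unexpected tag", (12, 40))
except StandInParseError as e:
    summary = describe_exception(e)
    print(summary)
```

The same idea could be applied from Python by catching the exception around the failing call and printing its type and args rather than str(e).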

Any idea how I can figure out what's causing this? I'd like to avoid sending the HTML to a public
list, but will gladly send it by private email.

Thank you
Misha

Howard B. Golden wrote:
> On Wednesday December 30, 2009, Misha Koshelev wrote:
> 
>> I obviously am not able to alter this website. However, I was
>>  wondering whether there is either (a) a way to query-replace all
>>  <br/> with <br /> _before_ twill parses a page, or (b) since I can
>>  do the replacement from a shell script fairly easily, a mechanism
>>  analogous to save_html to _input_ an HTML file into twill, say
>>  something like load_html?
> 
> If you trace how twill's commands work, you will see that they use a 
> browser.get_html() call to get the page's data. This is then used for 
> all the rest of the processing. For example, look at the code for find() 
> in commands.py.
> 
> The browser object also has a "result" attribute. This is a 
> ResultWrapper object (see utils.py) created by _journey() in browser.py 
> when it reads a page. Inside this object, there is a "page" attribute, 
> which you can play with (and modify).
> 
> In summary, you can create a browser object (call it "b"). Then, after 
> you do a "b.go(url)", you can modify "b.result.page" to be whatever you 
> want before calling other functions.
> 
> Note: All of this can be done using a "run" command. See my previous 
> message (http://lists.idyll.org/pipermail/twill/2009-March/000962.html) 
> for how this works.
> 
> (I haven't tested this, so it may have some syntax errors. Let me know 
> if you have any questions.)
> 
> Howard
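
The approach Howard describes (fetch the page, then rewrite b.result.page before running other commands) can be sketched in Python roughly as follows. This is only a sketch under stated assumptions: FakeBrowser and FakeResult are hypothetical stand-ins so the example runs without twill or network access, and normalize_br is a helper name made up for illustration. With twill itself, b would come from get_browser() after a go, as in the run line earlier in this thread.

```python
import re

# Matches <br/> in any letter case; leaves <br> and <br /> untouched.
_BR_NO_SPACE = re.compile(r"<br/>", re.IGNORECASE)


def normalize_br(html):
    """Rewrite every <br/> as <br /> and return the new markup."""
    return _BR_NO_SPACE.sub("<br />", html)


class FakeResult(object):
    """Hypothetical stand-in for the object behind b.result."""

    def __init__(self, page):
        self.page = page


class FakeBrowser(object):
    """Hypothetical stand-in for the object returned by get_browser()."""

    def __init__(self, page):
        self.result = FakeResult(page)


# Stand-in for the state after a page has been fetched.
b = FakeBrowser("<p>one<br/>two<BR/>three<br />four</p>")

# Same assignment as in the run command, using a case-insensitive replace.
b.result.page = normalize_br(b.result.page)
print(b.result.page)
```

Using a regular expression instead of str.replace also catches an uppercase <BR/>, which a plain replace('<br/>', '<br />') would miss.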