[twill] Working around <br/> issue?
Howard B. Golden
hgolden at socal.rr.com
Wed Dec 30 14:15:42 PST 2009
On Wednesday December 30, 2009, Misha Koshelev wrote:
> I obviously am not able to alter this website. However, I was
> wondering is there either: (a) a way to query-replace all <br/> with
> <br /> _before_ twill parses a website or, alternately,
> (b) I can do this from a shell script fairly easily. Is there a
> mechanism, analogous to save_html, to _input_ an HTML file into
> twill, say like load_html?
If you trace how twill's commands work, you will see that they use a
browser.get_html() call to get the page's data. This is then used for
all the rest of the processing. For example, look at the code for find()
in commands.py.
The browser object also has a "result" attribute. This is a
ResultWrapper object (see utils.py) created by _journey() in browser.py
when it reads a page. Inside this object, there is a "page" attribute,
which you can play with (and modify).
In summary, you can create a browser object (call it "b"). Then, after
you do a "b.go(url)", you can modify "b.result.page" to be whatever you
want before calling other functions.
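
As a rough sketch of that idea (untested against twill itself), the <br/> rewrite could look like the following. The names normalize_br and patch_page are made up for illustration. The only thing assumed about the browser is the "result.page" attribute described above. Whether every later command actually picks up the modified page is also an assumption.

```python
def normalize_br(html):
    """Return html with every literal '<br/>' rewritten as '<br />'."""
    return html.replace("<br/>", "<br />")


def patch_page(browser):
    """Rewrite browser.result.page in place and return the new text.

    Assumes the browser object exposes result.page as a string, as
    described in this message; nothing else about twill is relied on.
    """
    browser.result.page = normalize_br(browser.result.page)
    return browser.result.page


# Hypothetical usage with twill (not run here; the URL is a placeholder):
#   from twill import get_browser
#   b = get_browser()
#   b.go("http://example.com/")
#   patch_page(b)
```

You would call patch_page(b) right after b.go(url) and before any other commands that look at the page.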
Note: All of this can be done from inside a twill script using the "run"
command, which executes a line of Python. See my previous message
(http://lists.idyll.org/pipermail/twill/2009-March/000962.html)
for how this works.
(I haven't tested this, so it may have some syntax errors. Let me know
if you have any questions.)
Howard