[twill] Ignoring markup when finding text
Hancock, David (DHANCOCK)
DHANCOCK at arinc.com
Tue Jul 11 09:06:32 PDT 2006
I've used the html2text module before, and was grateful not to have to
worry about entity references and other niceties that I would have had
to dummy out by trial and error:
http://www.aaronsw.com/2002/html2text/
It may be overkill for simply getting the viewable text out of an HTML
document, but it'll work right out of the box.
Cheers!
--
David Hancock | dhancock at arinc.com | 410-266-4384
-----Original Message-----
From: twill-bounces at lists.idyll.org
[mailto:twill-bounces at lists.idyll.org] On Behalf Of Titus Brown
Sent: Tuesday, July 11, 2006 11:22 AM
To: Michael Hope
Cc: twill at lists.idyll.org
Subject: Re: [twill] Ignoring markup when finding text
On Tue, Jul 11, 2006 at 08:08:15AM +0000, Michael Hope wrote:
-> I've just started using twill to test a django based app, and was
-> wondering how to handle basic text tests.
->
-> I'd like to assert on what the user sees, not the HTML. For example,
-> the app has a 'Log in to post comments' sentence with an anchor around
-> the words 'Log in'. The test will be more robust and easier to read if
-> I can search for the plain text instead of the text with markup.
->
-> I made a quick change to 'find' to search a version of the page with
-> all tags and newlines stripped out and it worked well. I was thinking
-> about making this an option just like the current regex options. It's
-> a bit messy as you'd be mixing option classes but not too bad.
->
-> How do you people handle this?
Hi, Michael,
good question ;). I've been thinking about adding an option to 'show'
that strips all of the tags, e.g.
show --text
and this idea of yours fits pretty well. I'd worry about handling
things like newlines -- what if 'show --text' wraps the line "Log in\nto
post comments"? Any thoughts?
cheers,
--titus
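
One possible answer to the newline question, as a rough sketch only: strip
the tags, then collapse every run of whitespace (newlines included) to a
single space in both the page text and the search string, so a wrapped
"Log in\nto post comments" still matches. This uses only the Python
standard library, not twill's own code; the names VisibleText,
visible_text and find_text are made up for illustration and are not part
of twill or html2text.

```python
import re
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        # convert_charrefs=True decodes entity references such as &amp;
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_text(html):
    """Return the page text with tags removed and whitespace collapsed."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return re.sub(r"\s+", " ", "".join(parser.chunks)).strip()


def find_text(html, needle):
    """True if needle appears in the visible text, ignoring line wraps."""
    needle = re.sub(r"\s+", " ", needle).strip()
    return needle in visible_text(html)


page = '<p><a href="/login">Log in</a>\nto post\n comments &amp; replies</p>'
print(find_text(page, "Log in to post comments"))
```

One caveat with this approach: text from adjacent elements is joined with
no separator, so "<td>a</td><td>b</td>" comes out as "ab". Whether that is
acceptable depends on the pages being tested.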
_______________________________________________
twill mailing list
twill at lists.idyll.org
http://lists.idyll.org/listinfo/twill