[twill] Using Beautiful Soup to Find Images

Terry Peppers peppers at gmail.com
Thu Jul 13 05:20:53 PDT 2006


Yep.

That was it. Good to know.

Thanks William and Titus.

t.

On 7/13/06, Titus Brown <titus at caltech.edu> wrote:
> On Wed, Jul 12, 2006 at 02:23:16PM -0500, Terry Peppers wrote:
> -> Had a question for the group related to Beautiful Soup that is
> -> packaged with Twill.
> ->
> -> I'm trying to get away from using a regex to pull out all of the
> -> images in an HTML page. I figured I would use Beautiful Soup since it's
> -> included with Twill and it's made for parsing HTML, but I'm getting
> -> some seriously weird results.
>
> [ ... ]
>
> -> So I'm not sure if Twill comes with a scaled back version of
> -> BeautifulSoup or if I'm just approaching the problem incorrectly. (If
> -> I were a productive member of the OS community I would offer Titus a
> -> patch that would just pull all the images in....).
> ->
> -> Anyone?
>
> I bet William is right -- that you're using BS 3.0 terminology with BS
> 2.0.
>
> I can do the following:
>
> soup('img')
>
> to get all of the image tags, for example.
>
> I'm not familiar enough with BS 3.0 to figure out what the difference is
> between findAll and __call__ in BS 2.0, though.
>
> cheers,
> --titus
>
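
For anyone reading this thread later, here is a rough, regex-free sketch of
the same task. It does not use Beautiful Soup at all: it swaps in the
standard-library html.parser module from current Python 3, so it runs without
the copy bundled with twill. The names ImageCollector and find_image_sources
and the sample page are made up for illustration. With a soup object already
in hand, Titus's soup('img') call above is the shorter route.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this,
        # so <IMG SRC=...> is handled the same way as <img src=...>.
        if tag == "img":
            src = dict(attrs).get("src")
            if src is not None:
                self.sources.append(src)


def find_image_sources(html):
    """Return the src values of all <img> tags in document order."""
    collector = ImageCollector()
    collector.feed(html)
    collector.close()
    return collector.sources


# Made-up sample page, only for demonstration.
page = '<html><body><img src="a.png"><p>text</p><IMG SRC="b.gif" alt="b"/></body></html>'
print(find_image_sources(page))  # prints ['a.png', 'b.gif']
```

Self-closing tags such as <img ... /> go through the same handler, because
the parser's default handling of them calls handle_starttag.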


