[twill] Bug in FakeResponse

Jacob Hallén jacob at openend.se
Wed Jun 27 03:03:54 PDT 2007


On Tuesday 26 June 2007 at 22:17, John J Lee wrote:
> On Mon, 25 Jun 2007, Jacob Hallén wrote:
> > The problem is with copy.copy(response), where response is a FakeResponse
> > object from twill.
>
> Titus: not sure what you're using this FakeResponse for, but I strongly
> advise against its continuing to exist.  The mechanize code here is
> horrible and obscure bugs lurk for anybody who tries to use a
> response-like object that is not one of the objects returned by mechanize
> itself.  Does mechanize.make_response do what you need?
>
> (Of course I want to fix this in mechanize, but after the forever-receding
> stable release.  Dare I say that the next release looks like it will be
> marked stable?  no, better not...)
>
>
> John

As it turns out, there is a bug in mechanize as well.

My full fix for the problem is as follows:

1. Re-implement FakeResponse:

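class FakeResponse:
    # Keeps the whole body in memory and tracks a read position,
    # so the object can be read, rewound and copied.
    def __init__(self, data, url, info):
        self.data = data
        self.seekpos = 0
        self.url = url
        self._info = info

    def read(self, *args):
        # No size given: return everything from the current position.
        if not len(args):
            x = self.data[self.seekpos:]
            self.seekpos = len(self.data)
            return x
        # Note: a negative size returns an empty string here,
        # unlike file objects, where it means "read everything".
        elif args[0] < 0:
            return ''

        # Read at most args[0] characters; never move past the end.
        x = self.data[self.seekpos : self.seekpos + args[0]]
        self.seekpos = self.seekpos + args[0]
        if self.seekpos > len(self.data):
            self.seekpos = len(self.data)
        return x

    def seek(self, offset, whence=0):
        # whence: 0 = from the start, 1 = from the current position,
        # 2 = from the end (same meaning as for file objects).
        if whence == 1:
            self.seekpos += offset
        elif whence == 2:
            self.seekpos = len(self.data) + offset
        else:
            self.seekpos = offset

        # Clamp the position to the range 0..len(data).
        if self.seekpos < 0:
            self.seekpos = 0
        elif self.seekpos > len(self.data):
            self.seekpos = len(self.data)

    def info(self):
        return self._info

    def geturl(self):
        return self.url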

2. Modify the set_response method of the class RobustFactory:

    def set_response(self, response):
        # Assumes "copy" is imported at module level.
        import _beautifulsoup
        Factory.set_response(self, response)
        if response is not None:
            # Copy first: the response.read() call below consumes the stream.
            self._forms_factory.set_response(
                copy.copy(response), self.encoding)
            soup = self._soup_class(self.encoding, response.read())
            self._links_factory.set_soup(
                soup, response.geturl(), self.encoding)
            self._title_factory.set_soup(soup, self.encoding)

You have to make the copy before you call response.read(); otherwise the
copy starts out with an already exhausted stream.
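
To illustrate the ordering issue, here is a minimal sketch. SeekableStub is a
hypothetical stand-in for FakeResponse, not code from twill or mechanize; it
only keeps a read position in the same way as the class in step 1.

```python
import copy


class SeekableStub:
    """Hypothetical stand-in for a response object with a read position."""

    def __init__(self, data):
        self.data = data
        self.seekpos = 0

    def read(self):
        # Return everything from the current position and move to the end.
        x = self.data[self.seekpos:]
        self.seekpos = len(self.data)
        return x


# Wrong order: read first, then copy. The copy inherits the end position.
r1 = SeekableStub("<form></form>")
r1.read()
late_copy = copy.copy(r1)
late_result = late_copy.read()

# Right order: copy first, then read. Both objects deliver the full data.
r2 = SeekableStub("<form></form>")
early_copy = copy.copy(r2)
r2.read()
early_result = early_copy.read()

print(repr(late_result))   # ''
print(repr(early_result))  # '<form></form>'
```

The copy made after the read comes back empty, which is what the forms
factory would see if the two calls in set_response were in the other order.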

Apparently analyzing forms worked at some point in time. I wonder how it could 
be broken in two different places in two different packages. I think some 
tests would be a good idea.
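
As a starting point, a few tests could look like the sketch below. It repeats
a condensed version of the FakeResponse class from step 1 so that it runs on
its own; the test names, the sample data and the example.com URL are made up
for illustration and are not taken from the twill or mechanize test suites.

```python
import copy
import unittest


class FakeResponse:
    # Condensed copy of the class from step 1, repeated so the sketch runs alone.
    def __init__(self, data, url, info):
        self.data = data
        self.seekpos = 0
        self.url = url
        self._info = info

    def read(self, *args):
        if not args:
            x = self.data[self.seekpos:]
            self.seekpos = len(self.data)
            return x
        elif args[0] < 0:
            return ''
        x = self.data[self.seekpos:self.seekpos + args[0]]
        self.seekpos = min(self.seekpos + args[0], len(self.data))
        return x

    def seek(self, offset, whence=0):
        if whence == 1:
            self.seekpos += offset
        elif whence == 2:
            self.seekpos = len(self.data) + offset
        else:
            self.seekpos = offset
        # Clamp the position to the range 0..len(data).
        self.seekpos = max(0, min(self.seekpos, len(self.data)))

    def info(self):
        return self._info

    def geturl(self):
        return self.url


class FakeResponseTest(unittest.TestCase):
    def setUp(self):
        self.r = FakeResponse("0123456789", "http://example.com/", {})

    def test_read_in_chunks(self):
        self.assertEqual(self.r.read(4), "0123")
        self.assertEqual(self.r.read(), "456789")
        self.assertEqual(self.r.read(), "")

    def test_seek_is_clamped(self):
        self.r.seek(-3, 2)
        self.assertEqual(self.r.read(), "789")
        self.r.seek(100)
        self.assertEqual(self.r.read(), "")
        self.r.seek(-100, 1)
        self.assertEqual(self.r.read(2), "01")

    def test_copy_before_read_is_independent(self):
        c = copy.copy(self.r)
        self.r.read()
        self.assertEqual(c.read(), "0123456789")
        self.assertEqual(c.geturl(), "http://example.com/")


# Run the three tests and keep the result object for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(FakeResponseTest)
result = unittest.TextTestRunner().run(suite)
```

The last test covers exactly the case that broke form analysis: a copy taken
before the original is read must still deliver the full data.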

Jacob Hallén


