[twill] Bug in FakeResponse

Jacob Hallén jacob at openend.se
Mon Jun 25 08:33:14 PDT 2007


Hi,

I'm trying to use twill to examine forms in web pages, and it fails every 
time. After a fairly long debugging session I have pinpointed the problem.

Twill uses mechanize to parse the page and find out the contents of the 
various forms on the webpage (mechanize in turn uses ClientForm).

The interesting piece of code in mechanize looks like this:

<_html.py, RobustFactory>
...
    def set_response(self, response):
        import _beautifulsoup
        Factory.set_response(self, response)
        if response is not None:
            data = response.read()
            soup = self._soup_class(self.encoding, data)
            self._forms_factory.set_response(
                copy.copy(response), self.encoding)
            self._links_factory.set_soup(
                soup, response.geturl(), self.encoding)
            self._title_factory.set_soup(soup, self.encoding)

The problem is with copy.copy(response), where response is a FakeResponse 
object from twill.

copy.copy() makes a shallow copy, so the copy shares the original's StringIO 
object. response.read() has already moved that object's cursor to the end, and 
it is this exhausted copy that gets passed to ClientForm.

Patching mechanize as follows works around the problem:

    def set_response(self, response):
        import _beautifulsoup
        Factory.set_response(self, response)
        if response is not None:
            data = response.read()
            soup = self._soup_class(self.encoding, data)
            x = copy.copy(response)
            x.seek(0)  # rewind the buffer before ClientForm reads it
            self._forms_factory.set_response(
                x, self.encoding)
            self._links_factory.set_soup(
                soup, response.geturl(), self.encoding)
            self._title_factory.set_soup(soup, self.encoding)

However, this is not very neat. I don't know what the proper solution is.
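
One candidate, untested against twill itself, might be to fix this on the 
FakeResponse side by giving it a __copy__ method, so that copy.copy() returns 
a response with its own buffer positioned at the start. The sketch below is an 
assumption, not what twill or mechanize actually do: it is written for 
Python 3, uses io.StringIO in place of cStringIO, and the __copy__ method is a 
hypothetical addition.

```python
import copy
from io import StringIO


class FakeResponse:
    """Stand-in for twill's FakeResponse with a hypothetical __copy__ hook."""

    def __init__(self, data, url, info):
        self.fp = StringIO(data)
        self.url = url
        self._info = info

    def read(self, *args):
        return self.fp.read(*args)

    def seek(self, *args):
        return self.fp.seek(*args)

    def info(self):
        return self._info

    def geturl(self):
        return self.url

    def __copy__(self):
        # Hypothetical addition: wrap the full data in a fresh buffer, so the
        # copy starts at position 0 and no longer shares a cursor with us.
        return FakeResponse(self.fp.getvalue(), self.url, self._info)


original = FakeResponse('abcdefghijklmnop', 'Some url', 'Some info')
data = original.read()            # consumes the original, as set_response does
duplicate = copy.copy(original)   # goes through __copy__
duplicate_data = duplicate.read()
print('copy: ', duplicate_data)
```

With this in place mechanize would not need patching, at the cost of 
duplicating the page data once per copy.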

Finally, here is a program that tests the behaviour of StringIO in 
FakeResponse:

from cStringIO import StringIO

class FakeResponse:
    def __init__(self, data, url, info):
        self.fp = StringIO(data)
        self.url = url
        self._info = info

    def read(self, *args):
        print args
        x = self.fp.read(*args)
        return x

    def seek(self, *args):
        return self.fp.seek(*args)

    def info(self):
        return self._info

    def geturl(self):
        return self.url

import copy

a = FakeResponse('abcdefghijklmnop', 'Some url', 'Some info')

b = copy.copy(a)

print 'a: ', a.read()
print 'b: ', b.read()
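
A Python 3 variant of the test makes the cause visible: copy.copy() is 
shallow, so a and b hold the same StringIO object. This is only a sketch; 
io.StringIO stands in for cStringIO, and the class is trimmed to the parts the 
check needs.

```python
import copy
from io import StringIO


class FakeResponse:
    def __init__(self, data, url, info):
        self.fp = StringIO(data)
        self.url = url
        self._info = info

    def read(self, *args):
        return self.fp.read(*args)


a = FakeResponse('abcdefghijklmnop', 'Some url', 'Some info')
b = copy.copy(a)            # shallow copy: b.fp is the same object as a.fp

shared = b.fp is a.fp
first = a.read()            # moves the shared cursor to the end
position = b.fp.tell()      # so the "copy" is already at the end too
second = b.read()           # nothing left to read

print('same buffer:', shared)
print('a:', repr(first))
print('b:', repr(second))
```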


I should add that I really like twill; otherwise I would not have gone to the 
lengths of debugging it.

Cheers

Jacob Hallén


