[twill] Bug in FakeResponse
Jacob Hallén
jacob at openend.se
Mon Jun 25 08:33:14 PDT 2007
Hi,
I'm trying to use twill to examine forms in webpages, and it fails every
time. After a fairly long debugging session I have pinpointed the problem.
Twill uses mechanize to parse the page and find out the contents of the
various forms on the webpage (mechanize in turn uses ClientForm).
The interesting piece of code in mechanize looks like this:
<_html.py, RobustFactory>
...
    def set_response(self, response):
        import _beautifulsoup
        Factory.set_response(self, response)
        if response is not None:
            data = response.read()
            soup = self._soup_class(self.encoding, data)
            self._forms_factory.set_response(
                copy.copy(response), self.encoding)
            self._links_factory.set_soup(
                soup, response.geturl(), self.encoding)
            self._title_factory.set_soup(soup, self.encoding)
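The problem is with copy.copy(response), where response is a FakeResponse
object from twill.
copy.copy() makes a shallow copy, so the copy shares the same underlying
StringIO object as the original. response.read() has already moved that
object's cursor to the end, and it is the copy, with nothing left to read,
that gets passed to ClientForm.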
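A minimal sketch of the effect, written for Python 3 with io.StringIO rather than cStringIO. The Response class here is a hypothetical stand-in, not twill's FakeResponse. It suggests why the copy has nothing left to read: a shallow copy shares the original's buffer, cursor included.

```python
import copy
from io import StringIO


class Response:
    """Hypothetical stand-in for a response object that wraps a buffer."""

    def __init__(self, data):
        self.fp = StringIO(data)


original = Response("abcdef")
original.fp.read()               # consume the buffer, as response.read() does
duplicate = copy.copy(original)  # shallow copy: attributes are shared, not duplicated

shares_buffer = duplicate.fp is original.fp
position = duplicate.fp.tell()
print(shares_buffer, position)
```

Since both objects refer to one buffer, seeking on either one moves the cursor for both.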
Patching mechanize as follows works around the problem:
    def set_response(self, response):
        import _beautifulsoup
        Factory.set_response(self, response)
        if response is not None:
            data = response.read()
            soup = self._soup_class(self.encoding, data)
            # Rewind the copy before handing it to the forms factory.
            x = copy.copy(response)
            x.seek(0)
            self._forms_factory.set_response(
                x, self.encoding)
            self._links_factory.set_soup(
                soup, response.geturl(), self.encoding)
            self._title_factory.set_soup(soup, self.encoding)
However, this is not very neat. I don't know what the proper solution is.
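One possible direction, offered only as a sketch and not as the fix twill or mechanize adopted: the response class could define __copy__ so that every copy gets its own buffer, positioned at the start. The code below is written for Python 3 with io.StringIO, and the _data attribute and the __copy__ method are assumptions added for illustration; they are not part of twill's FakeResponse.

```python
import copy
from io import StringIO


class FakeResponse:
    """Simplified stand-in for illustration, not twill's actual class."""

    def __init__(self, data, url, info):
        self._data = data            # hypothetical: keep the raw data for copies
        self.fp = StringIO(data)
        self.url = url
        self._info = info

    def read(self, *args):
        return self.fp.read(*args)

    def seek(self, *args):
        return self.fp.seek(*args)

    def info(self):
        return self._info

    def geturl(self):
        return self.url

    def __copy__(self):
        # Hypothetical hook: build the copy around a fresh buffer so its
        # cursor starts at 0, independent of the original's position.
        return FakeResponse(self._data, self.url, self._info)


a = FakeResponse("abcdefghijklmnop", "Some url", "Some info")
first = a.read()      # moves a's cursor to the end
b = copy.copy(a)      # copy.copy() calls FakeResponse.__copy__
second = b.read()     # b reads from its own buffer
print("a:", first)
print("b:", second)
```

With this in place the caller would not need to seek on the copy, at the cost of keeping a second reference to the data.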
Finally, here is a program that tests the behaviour of StringIO in
FakeResponse:
    import copy
    from cStringIO import StringIO

    class FakeResponse:
        def __init__(self, data, url, info):
            self.fp = StringIO(data)
            self.url = url
            self._info = info

        def read(self, *args):
            print args
            x = self.fp.read(*args)
            return x

        def seek(self, *args):
            return self.fp.seek(*args)

        def info(self):
            return self._info

        def geturl(self):
            return self.url

    a = FakeResponse('abcdefghijklmnop', 'Some url', 'Some info')
    b = copy.copy(a)
    print 'a: ', a.read()
    print 'b: ', b.read()
I should add that I really like twill; otherwise I would not have gone to the
trouble of debugging it.
Cheers
Jacob Hallén