[twill] Bug in FakeResponse
Jacob Hallén
jacob at openend.se
Wed Jun 27 03:03:54 PDT 2007
On Tuesday 26 June 2007 22:17, John J Lee wrote:
> On Mon, 25 Jun 2007, Jacob Hallén wrote:
> > The problem is with copy.copy(response), where response is a FakeResponse
> > object from twill.
>
> Titus: not sure what you're using this FakeResponse for, but I strongly
> advise against its continuing to exist. The mechanize code here is
> horrible and obscure bugs lurk for anybody who tries to use a
> response-like object that is not one of the objects returned by mechanize
> itself. Does mechanize.make_response do what you need?
>
> (Of course I want to fix this in mechanize, but after the forever-receding
> stable release. Dare I say that the next release looks like it will be
> marked stable? no, better not...)
>
>
> John
As it turns out, there is a bug in mechanize as well.
My full fix for the problem is as follows:
1. Re-implement FakeResponse:
class FakeResponse:
    def __init__(self, data, url, info):
        self.data = data
        self.seekpos = 0
        self.url = url
        self._info = info

    def read(self, *args):
        if not len(args):
            x = self.data[self.seekpos:]
            self.seekpos = len(self.data)
            return x
        elif args[0] < 0:
            return ''
        x = self.data[self.seekpos : self.seekpos + args[0]]
        self.seekpos = self.seekpos + args[0]
        if self.seekpos > len(self.data):
            self.seekpos = len(self.data)
        return x

    def seek(self, offset, whence=0):
        if whence == 1:
            self.seekpos += offset
        elif whence == 2:
            self.seekpos = len(self.data) + offset
        else:
            self.seekpos = offset
        if self.seekpos < 0:
            self.seekpos = 0
        elif self.seekpos > len(self.data):
            self.seekpos = len(self.data)

    def info(self):
        return self._info

    def geturl(self):
        return self.url
2. Modify the set_response method of the class RobustFactory:
def set_response(self, response):
    import _beautifulsoup
    Factory.set_response(self, response)
    if response is not None:
        self._forms_factory.set_response(
            copy.copy(response), self.encoding)
        soup = self._soup_class(self.encoding, response.read())
        self._links_factory.set_soup(
            soup, response.geturl(), self.encoding)
        self._title_factory.set_soup(soup, self.encoding)
You have to make the copy before you do the response.read(), otherwise you
have an exhausted stream in the copy you make.
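
A minimal sketch of why the order matters. TinyResponse is a hypothetical
stand-in that only models a buffer plus a read position, like the seekpos
attribute above. It is not the mechanize or twill class, and real response
objects may behave differently.

```python
import copy


class TinyResponse:
    """Hypothetical stand-in: a data buffer plus a read position."""

    def __init__(self, data):
        self.data = data
        self.seekpos = 0

    def read(self):
        # Return everything from the current position and move to the end.
        chunk = self.data[self.seekpos:]
        self.seekpos = len(self.data)
        return chunk


# Copy first, then read: the copy still starts at position 0.
original = TinyResponse("<form></form>")
early_copy = copy.copy(original)
original.read()
early_result = early_copy.read()

# Read first, then copy: the copy inherits the end-of-data position.
original = TinyResponse("<form></form>")
original.read()
late_copy = copy.copy(original)
late_result = late_copy.read()

print(repr(early_result))
print(repr(late_result))
```

copy.copy makes a shallow copy, so the read position is duplicated along with
the data. A copy taken after read() therefore starts at the end of the data
and returns nothing unless it is rewound with seek(0).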
Apparently analyzing forms worked at some point in time. I wonder how it could
be broken in two different places in two different packages. I think some
tests would be a good idea.
Jacob Hallén