[twill] Can't login to MySpace

twill.overbored at spamgourmet.com
Wed Aug 9 22:04:37 PDT 2006


So sorry - that was totally wrong.

In fact, both versions of mechanize work (I was wrong because of yet
more Python import confusion). I.e., using the latest darcs checkout
of twill, this works:

from twill.other_packages.mechanize import *
b = Browser()
b.open('http://myspace.com')
b.select_form('theForm')
b['email'] = 'myname@myhost'
b['password'] = 'mypass'
print b.submit().read()   # Python 2 print statement

Yet the following fails!

from twill.commands import *
go('http://myspace.com')

This is most bizarre, given that twill simply invokes open(). I
made sure that no other copies of these packages were being picked up:
cd to a directory outside the twill darcs checkout and confirm that
importing BeautifulSoup, twill, ClientForm, and mechanize all fail
there. The exception is:

/export/home/bob/tmp/work/twill2/twill/commands.py in go(url)
    102     Visit the URL given.
    103     """
--> 104     browser.go(url)
    105     return browser.get_url()
    106

/export/home/bob/tmp/work/twill2/twill/browser.py in go(self, url)
    112         for u in try_urls:
    113             try:
--> 114                 self._journey('open', u)
    115                 success = True
    116                 break

/export/home/bob/tmp/work/twill2/twill/browser.py in _journey(self,
func_name, *args, **kwargs)
    499         func = getattr(self._browser, func_name)
    500         try:
--> 501             r = func(*args, **kwargs)
    502         except urllib2.HTTPError, e:
    503             r = e

/export/home/bob/tmp/work/twill2/twill/other_packages/mechanize/_mechanize.py
in open(self, url, data)
    128         if self._response is not None:
    129             self._response.close()
--> 130         return self._mech_open(url, data)
    131
    132     def _mech_open(self, url, data=None, update_history=True):

/export/home/bob/tmp/work/twill2/twill/other_packages/mechanize/_mechanize.py
in _mech_open(self, url, data, update_history)
    168 ##             # acceptable.
    169 ##             raise
--> 170         self.set_response(response)
    171         if not success:
    172             raise error

/export/home/bob/tmp/work/twill2/twill/other_packages/mechanize/_mechanize.py
in set_response(self, response)
    211
    212         self._response = response
--> 213         self._factory.set_response(self._response)
    214
    215     def geturl(self):


/export/home/bob/tmp/work/twill2/twill/utils.py in set_response(self, response)
    390         else:
    391             self.factory = self.basic_factory
--> 392         self._cleanup_html(response)
    393
    394     def links(self):

/export/home/bob/tmp/work/twill2/twill/utils.py in _cleanup_html(self, response)
    426
    427         self.factory.set_response(FakeResponse(self._html, self._url,
--> 428                                                response.info()))
    429
    430     def use_BS(self):

/export/home/bob/tmp/work/twill2/twill/other_packages/mechanize/_html.py
in set_response(self, response)
    576         if response is not None:
    577             data = response.read()
--> 578             soup = self._soup_class(self.encoding, data)
    579             self._forms_factory.set_response(response, self.encoding)
    580             self._links_factory.set_soup(

/export/home/bob/tmp/work/twill2/twill/other_packages/mechanize/_html.py
in __init__(self, encoding, text)
    319             self._encoding = encoding
    320             ### @CTB
--> 321             BeautifulSoup.BeautifulSoup.__init__(self, text)
    322
    323         def handle_charref(self, ref):

/export/home/bob/tmp/work/twill2/twill/other_packages/BeautifulSoup.py
in __init__(self, *args, **kwargs)
   1324         if not kwargs.has_key('smartQuotesTo'):
   1325             kwargs['smartQuotesTo'] = self.HTML_ENTITIES
-> 1326         BeautifulStoneSoup.__init__(self, *args, **kwargs)
   1327
   1328     SELF_CLOSING_TAGS = buildTagMap(None,

/export/home/bob/tmp/work/twill2/twill/other_packages/BeautifulSoup.py
in __init__(self, markup, parseOnlyThese, fromEncoding, markupMassage,
smartQuotesTo, convertEntities, selfClosingTags)
    971         self.markupMassage = markupMassage
    972         try:
--> 973             self._feed()
    974         except StopParsing:
    975             pass

/export/home/bob/tmp/work/twill2/twill/other_packages/BeautifulSoup.py
in _feed(self, inDocumentEncoding)
    996         self.reset()
    997
--> 998         SGMLParser.feed(self, markup or "")
    999         SGMLParser.close(self)
   1000         # Close out any unfinished strings and close all the open tags.

/usr/lib/python2.4/sgmllib.py in feed(self, data)
     93
     94         self.rawdata = self.rawdata + data
---> 95         self.goahead(0)
     96
     97     def close(self):

/usr/lib/python2.4/sgmllib.py in goahead(self, end)
    127                         i = i+1
    128                         continue
--> 129                     k = self.parse_starttag(i)
    130                     if k < 0: break
    131                     i = k

/usr/lib/python2.4/sgmllib.py in parse_starttag(self, i)
    278             j = j+1
    279         self.__starttag_text = rawdata[start_pos:j]
--> 280         self.finish_starttag(tag, attrs)
    281         return j
    282

/usr/lib/python2.4/sgmllib.py in finish_starttag(self, tag, attrs)
    309                 method = getattr(self, 'do_' + tag)
    310             except AttributeError:
--> 311                 self.unknown_starttag(tag, attrs)
    312                 return -1
    313             else:

/export/home/bob/tmp/work/twill2/twill/other_packages/BeautifulSoup.py
in unknown_starttag(self, name, attrs, selfClosing)
   1153             self.currentData.append('<%s%s>' % (name, attrs))
   1154             return
-> 1155         self.endData()
   1156
   1157         if not self.isSelfClosingTag(name) and not selfClosing:

/export/home/bob/tmp/work/twill2/twill/other_packages/BeautifulSoup.py
in endData(self, containerClass)
   1055     def endData(self, containerClass=NavigableString):
   1056         if self.currentData:
-> 1057             currentData = ''.join(self.currentData)
   1058             if currentData.endswith('<') and self.convertHTMLEntities:
   1059                 currentData = currentData[:-1] + '&lt;'
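
The isolation check described before the traceback (confirming that no stray copies of the packages are importable) could be sketched roughly as follows. This assumes a Python with importlib (2.7 or later, so newer than the 2.4 in the traceback), and the helper name locate_modules is made up for illustration:

```python
import importlib


def locate_modules(names):
    """Map each module name to the file it was imported from.

    None means the import failed; "<no __file__>" means the module
    imported but has no file on disk (for example, a built-in).
    """
    locations = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            locations[name] = None
        else:
            locations[name] = getattr(module, "__file__", None) or "<no __file__>"
    return locations


if __name__ == "__main__":
    packages = ["BeautifulSoup", "twill", "ClientForm", "mechanize"]
    for name, path in sorted(locate_modules(packages).items()):
        print("%-15s %s" % (name, path or "not importable"))
```

Run from a directory outside the checkout, every entry should read "not importable". Run from inside it, the paths show which copy of each package is actually in use.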

I spent a good couple of hours trying to find out what's going on, but
I give up. So close! Details: the first script (mechanize directly)
never even executes endData(), while the second (twill's go()) does,
and finds 0xc2 in self.currentData[-1][0]. Anybody know what's up?
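
A possibly relevant detail about that byte: 0xc2 is the lead byte of a two-byte UTF-8 sequence (for instance, the no-break space U+00A0 is encoded as 0xc2 0xa0), and it is not valid ASCII. In Python 2, joining byte strings with unicode strings triggers an implicit ASCII decode. Whether that is what fails here is only an assumption, since the traceback above stops before the exception type. The sketch below just shows how such bytes behave under the two codecs; try_decode is an illustrative helper:

```python
# -*- coding: utf-8 -*-

# UTF-8 encoding of U+00A0 (no-break space); the first byte is 0xc2.
NBSP_UTF8 = b"\xc2\xa0"


def try_decode(raw, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# ASCII cannot represent 0xc2, so the decode fails and the helper returns None.
print(repr(try_decode(NBSP_UTF8, "ascii")))

# The same bytes decode cleanly as UTF-8 to a single character.
print(try_decode(NBSP_UTF8, "utf-8") == u"\xa0")
```

If this is the cause, comparing the types of the items in self.currentData (str versus unicode) between the two scripts might show where they diverge.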

On 8/9/06, I wrote:
> OK, the problem was an outdated version of mechanize packaged with
> twill. I tested both the mechanize from its own subversion repository
> and the one included in the latest (darcs) version of twill, and only
> the latter worked.
>
> Titus, do you plan to update the components used by twill (in
> particular, the mechanize library, so that this problem is fixed)?
> Thanks.
>
> On 8/8/06, John J Lee - jjl at pobox.com wrote:
> > On Tue, 8 Aug 2006, twill.overbored at spamgourmet.com wrote:
> >
> > > Would you mind sharing the code snippet to make this happen? What
> > > version of mechanize did you use? (The one included with twill?)
> > > Thanks.
> > [...]
> >
> > I didn't do anything special, and used mechanize SVN with Python 2.5.
> >
> > I won't post/email the ten line script: I don't want to encourage scraping
> > against terms of use (it seems the no-scrape clause is rather a legal
> > reflex action these days).  Simply logging in doesn't appear to be against
> > their terms, but I assume in your case logging in would be a prelude to
> > some automated action, which they do prohibit.
> >
> >
> > John
> >
> >
>


