[twill] Can't login to MySpace
twill.overbored at spamgourmet.com
Wed Aug 9 23:32:59 PDT 2006
Here's a simple function tracer along with the traces. The working
version executes a lot more code even before the first open(), but
why?
On 8/10/06, I wrote:
> So sorry - that was totally wrong.
>
> In fact, both versions of mechanize work (I was wrong because of yet
> more Python import confusion). I.e., using the latest darcs checkout
> of twill, this works:
>
> from twill.other_packages.mechanize import *
> b=Browser()
> b.open('http://myspace.com')
> b.select_form('theForm')
> b['email']='myname at myhost'
> b['password']='mypass'
> print b.submit().read()
>
> Yet the following fails!
>
> from twill.commands import *
> go('http://myspace.com')
>
> This is most bizarre, given that twill simply invokes open(). I
> made sure that there were no other packages - just cd to a directory
> that isn't the twill darcs directory and then make sure that importing
> BeautifulSoup, twill, ClientForm, and mechanize all fail. The
> exception is:
>
> /export/home/bob/tmp/work/twill2/twill/commands.py in go(url)
> 102 Visit the URL given.
> 103 """
> --> 104 browser.go(url)
> 105 return browser.get_url()
> 106
>
> /export/home/bob/tmp/work/twill2/twill/browser.py in go(self, url)
> 112 for u in try_urls:
> 113 try:
> --> 114 self._journey('open', u)
> 115 success
> 116 break
>
> /export/home/bob/tmp/work/twill2/twill/browser.py in _journey(self,
> func_name, *args, **kwargs)
> 499 func
> 500 try:
> --> 501 r
> 502 except urllib2.HTTPError, e:
> 503 r
>
> /export/home/bob/tmp/work/twill2/twill/other_packages/mechanize/_mechanize.py
> in open(self, url, data)
> 128 if self._response is not None:
> 129 self._response.close()
> --> 130 return self._mech_open(url, data)
> 131
> 132 def _mech_open(self, url, data=None, update_history=True):
>
> /export/home/bob/tmp/work/twill2/twill/other_packages/mechanize/_mechanize.py
> in _mech_open(self, url, data, update_history)
> 168 ## # acceptable.
> 169 ## raise
> --> 170 self.set_response(response)
> 171 if not success:
> 172 raise error
>
> /export/home/bob/tmp/work/twill2/twill/other_packages/mechanize/_mechanize.py
> in set_response(self, response)
> 211
> 212 self._response
> --> 213 self._factory.set_response(self._response)
> 214
> 215 def geturl(self):
>
>
> /export/home/bob/tmp/work/twill2/twill/utils.py in set_response(self, response)
> 390 else:
> 391 self.factory
> --> 392 self._cleanup_html(response)
> 393
> 394 def links(self):
>
> /export/home/bob/tmp/work/twill2/twill/utils.py in _cleanup_html(self, response)
> 426
> 427 self.factory.set_response(FakeResponse(self._html, self._url,
> --> 428 response.info()))
> 429
> 430 def use_BS(self):
>
> /export/home/bob/tmp/work/twill2/twill/other_packages/mechanize/_html.py
> in set_response(self, response)
> 576 if response is not None:
> 577 data
> --> 578 soup
> 579 self._forms_factory.set_response(response, self.encoding)
> 580 self._links_factory.set_soup(
>
> /export/home/bob/tmp/work/twill2/twill/other_packages/mechanize/_html.py
> in __init__(self, encoding, text)
> 319 self._encoding
> 320 ### @CTB
> --> 321 BeautifulSoup.BeautifulSoup.__init__(self, text)
> 322
> 323 def handle_charref(self, ref):
>
> /export/home/bob/tmp/work/twill2/twill/other_packages/BeautifulSoup.py
> in __init__(self, *args, **kwargs)
> 1324 if not kwargs.has_key('smartQuotesTo'):
> 1325 kwargs['smartQuotesTo']
> -> 1326 BeautifulStoneSoup.__init__(self, *args, **kwargs)
> 1327
> 1328 SELF_CLOSING_TAGS
>
> /export/home/bob/tmp/work/twill2/twill/other_packages/BeautifulSoup.py
> in __init__(self, markup, parseOnlyThese, fromEncoding, markupMassage,
> smartQuotesTo, convertEntities, selfClosingTags)
> 971 self.markupMassage
> 972 try:
> --> 973 self._feed()
> 974 except StopParsing:
> 975 pass
>
> /export/home/bob/tmp/work/twill2/twill/other_packages/BeautifulSoup.py
> in _feed(self, inDocumentEncoding)
> 996 self.reset()
> 997
> --> 998 SGMLParser.feed(self, markup or "")
> 999 SGMLParser.close(self)
> 1000 # Close out any unfinished strings and close all the open tags.
>
> /usr/lib/python2.4/sgmllib.py in feed(self, data)
> 93
> 94 self.rawdata
> ---> 95 self.goahead(0)
> 96
> 97 def close(self):
>
> /usr/lib/python2.4/sgmllib.py in goahead(self, end)
> 127 i
> 128 continue
> --> 129 k
> 130 if k < 0: break
> 131 i
>
> /usr/lib/python2.4/sgmllib.py in parse_starttag(self, i)
> 278 j
> 279 self.__starttag_text
> --> 280 self.finish_starttag(tag, attrs)
> 281 return j
> 282
>
> /usr/lib/python2.4/sgmllib.py in finish_starttag(self, tag, attrs)
> 309 method
> 310 except AttributeError:
> --> 311 self.unknown_starttag(tag, attrs)
> 312 return -1
> 313 else:
>
> /export/home/bob/tmp/work/twill2/twill/other_packages/BeautifulSoup.py
> in unknown_starttag(self, name, attrs, selfClosing)
> 1153 self.currentData.append('<%s%s>' % (name, attrs))
> 1154 return
> -> 1155 self.endData()
> 1156
> 1157 if not self.isSelfClosingTag(name) and not selfClosing:
>
> /export/home/bob/tmp/work/twill2/twill/other_packages/BeautifulSoup.py
> in endData(self, containerClass)
> 1055 def endData(self, containerClass=NavigableString):
> 1056 if self.currentData:
> -> 1057 currentData
> 1058 if currentData.endswith('<') and self.convertHTMLEntities:
> 1059 currentData
>
> I spent a good couple of hours trying to find out what's going on but
> I give up. So close! Details: the former script never even executes
> endData(), while the second does and finds 0xc2 in
> self.currentData[-1][0]. Anybody know what's up?
>
> On 8/9/06, I wrote:
> > OK, the problem was an outdated version of mechanize packaged with
> > twill. I tested both the mechanize from its own subversion repository
> > and the one included in the latest (darcs) version of twill, and only
> > the latter worked.
> >
> > Titus, do you plan to update the components used by twill (in
> > particular, the mechanize library, so that this problem is fixed)?
> > Thanks.
> >
> > On 8/8/06, John J Lee <jjl at pobox.com> wrote:
> > > On Tue, 8 Aug 2006, twill.overbored at spamgourmet.com wrote:
> > >
> > > > Would you mind sharing the code snippet to make this happen? What
> > > > version of mechanize did you use? (The one included with twill?)
> > > > Thanks.
> > > [...]
> > >
> > > I didn't do anything special, and used mechanize SVN with Python 2.5.
> > >
> > > I won't post/email the ten line script: I don't want to encourage scraping
> > > against terms of use (it seems the no-scrape clause is rather a legal
> > > reflex action these days). Simply logging in doesn't appear to be against
> > > their terms, but I assume in your case logging in would be a prelude to
> > > some automated action, which they do prohibit.
> > >
> > >
> > > John
> > >
> > >
> >
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: files.tar.bz2
Type: application/x-bzip2
Size: 17437 bytes
Desc: not available
Url : http://lists.idyll.org/pipermail/twill/attachments/20060810/a90d3fdf/attachment-0001.bin