[twill] Can't login to MySpace
twill.overbored at spamgourmet.com
Wed Aug 9 22:04:37 PDT 2006
So sorry - that was totally wrong.
In fact, both versions of mechanize work (I was wrong because of yet
more Python import confusion). I.e., using the latest darcs checkout
of twill, this works:
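# Python 2, using the mechanize bundled with the twill darcs checkout
from twill.other_packages.mechanize import Browser

b = Browser()
b.open('http://myspace.com')
b.select_form('theForm')  # select the login form by name
b['email'] = 'myname at myhost'
b['password'] = 'mypass'
print b.submit().read()  # dump the page returned after submitting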
Yet the following fails!
from twill.commands import *
go('http://myspace.com')
This is most bizarre, given that twill simply invokes open(). I made
sure that no other copies of these packages were being picked up: from
a directory that isn't the twill darcs directory, importing
BeautifulSoup, twill, ClientForm, and mechanize all fail. The
exception is:
/export/home/bob/tmp/work/twill2/twill/commands.py in go(url)
102 Visit the URL given.
103 """
--> 104 browser.go(url)
105 return browser.get_url()
106
/export/home/bob/tmp/work/twill2/twill/browser.py in go(self, url)
112 for u in try_urls:
113 try:
--> 114 self._journey('open', u)
115 success = True
116 break
/export/home/bob/tmp/work/twill2/twill/browser.py in _journey(self,
func_name, *args, **kwargs)
499 func = getattr(self._browser, func_name)
500 try:
--> 501 r = func(*args, **kwargs)
502 except urllib2.HTTPError, e:
503 r = e
/export/home/bob/tmp/work/twill2/twill/other_packages/mechanize/_mechanize.py
in open(self, url, data)
128 if self._response is not None:
129 self._response.close()
--> 130 return self._mech_open(url, data)
131
132 def _mech_open(self, url, data=None, update_history=True):
/export/home/bob/tmp/work/twill2/twill/other_packages/mechanize/_mechanize.py
in _mech_open(self, url, data, update_history)
168 ## # acceptable.
169 ## raise
--> 170 self.set_response(response)
171 if not success:
172 raise error
/export/home/bob/tmp/work/twill2/twill/other_packages/mechanize/_mechanize.py
in set_response(self, response)
211
212 self._response = response
--> 213 self._factory.set_response(self._response)
214
215 def geturl(self):
/export/home/bob/tmp/work/twill2/twill/utils.py in set_response(self, response)
390 else:
391 self.factory = self.basic_factory
--> 392 self._cleanup_html(response)
393
394 def links(self):
/export/home/bob/tmp/work/twill2/twill/utils.py in _cleanup_html(self, response)
426
427 self.factory.set_response(FakeResponse(self._html, self._url,
--> 428 response.info()))
429
430 def use_BS(self):
/export/home/bob/tmp/work/twill2/twill/other_packages/mechanize/_html.py
in set_response(self, response)
576 if response is not None:
577 data = response.read()
--> 578 soup = self._soup_class(self.encoding, data)
579 self._forms_factory.set_response(response, self.encoding)
580 self._links_factory.set_soup(
/export/home/bob/tmp/work/twill2/twill/other_packages/mechanize/_html.py
in __init__(self, encoding, text)
319 self._encoding = encoding
320 ### @CTB
--> 321 BeautifulSoup.BeautifulSoup.__init__(self, text)
322
323 def handle_charref(self, ref):
/export/home/bob/tmp/work/twill2/twill/other_packages/BeautifulSoup.py
in __init__(self, *args, **kwargs)
1324 if not kwargs.has_key('smartQuotesTo'):
1325 kwargs['smartQuotesTo'] = self.HTML_ENTITIES
-> 1326 BeautifulStoneSoup.__init__(self, *args, **kwargs)
1327
1328 SELF_CLOSING_TAGS = buildTagMap(None,
/export/home/bob/tmp/work/twill2/twill/other_packages/BeautifulSoup.py
in __init__(self, markup, parseOnlyThese, fromEncoding, markupMassage,
smartQuotesTo, convertEntities, selfClosingTags)
971 self.markupMassage = markupMassage
972 try:
--> 973 self._feed()
974 except StopParsing:
975 pass
/export/home/bob/tmp/work/twill2/twill/other_packages/BeautifulSoup.py
in _feed(self, inDocumentEncoding)
996 self.reset()
997
--> 998 SGMLParser.feed(self, markup or "")
999 SGMLParser.close(self)
1000 # Close out any unfinished strings and close all the open tags.
/usr/lib/python2.4/sgmllib.py in feed(self, data)
93
94 self.rawdata = self.rawdata + data
---> 95 self.goahead(0)
96
97 def close(self):
/usr/lib/python2.4/sgmllib.py in goahead(self, end)
127 i = i+1
128 continue
--> 129 k = self.parse_starttag(i)
130 if k < 0: break
131 i = k
/usr/lib/python2.4/sgmllib.py in parse_starttag(self, i)
278 j = j+1
279 self.__starttag_text = rawdata[start_pos:j]
--> 280 self.finish_starttag(tag, attrs)
281 return j
282
/usr/lib/python2.4/sgmllib.py in finish_starttag(self, tag, attrs)
309 method = getattr(self, 'do_' + tag)
310 except AttributeError:
--> 311 self.unknown_starttag(tag, attrs)
312 return -1
313 else:
/export/home/bob/tmp/work/twill2/twill/other_packages/BeautifulSoup.py
in unknown_starttag(self, name, attrs, selfClosing)
1153 self.currentData.append('<%s%s>' % (name, attrs))
1154 return
-> 1155 self.endData()
1156
1157 if not self.isSelfClosingTag(name) and not selfClosing:
/export/home/bob/tmp/work/twill2/twill/other_packages/BeautifulSoup.py
in endData(self, containerClass)
1055 def endData(self, containerClass=NavigableString):
1056 if self.currentData:
-> 1057 currentData = ''.join(self.currentData)
1058 if currentData.endswith('<') and self.convertHTMLEntities:
1059 currentData = currentData[:-1] + '&lt;'
I spent a good couple of hours trying to find out what's going on, but
I give up. So close! Details: the first script (plain mechanize) never
even executes endData(), while the second (twill's go()) does, and
finds the byte 0xc2 in self.currentData[-1][0]. Does anybody know
what's up?
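
A possible way to narrow this down, offered only as a sketch: the final
exception message is not shown above, so the following assumes the failure
is an implicit ASCII decode. 0xc2 is the lead byte of a two-byte UTF-8
sequence, and in Python 2 ''.join() on a list that mixes unicode strings
with byte strings decodes the byte strings as ASCII, which fails on any
byte above 0x7f. The helper name non_ascii_chunks and the sample data are
hypothetical, not taken from twill or BeautifulSoup. The sketch uses the
bytes type (Python 2.6 or later, and Python 3); on Python 2.4, str would
take its place.

```python
# Sketch only: list the byte-string chunks that cannot be decoded as ASCII.
# These are the chunks that would break ''.join() in Python 2 when the list
# also contains unicode strings.

def non_ascii_chunks(chunks):
    """Return (index, chunk) pairs for byte strings that are not pure ASCII."""
    bad = []
    for i, chunk in enumerate(chunks):
        if isinstance(chunk, bytes):
            try:
                chunk.decode('ascii')
            except UnicodeDecodeError:
                bad.append((i, chunk))
    return bad


# Hypothetical data: a unicode chunk followed by raw UTF-8 bytes that start
# with 0xc2 (0xc2 0xa0 is the UTF-8 encoding of a non-breaking space).
sample = [u'<td>', b'\xc2\xa0', b'plain']
print(non_ascii_chunks(sample))
```

Running something like this over self.currentData just before the join
would show whether undecoded bytes are reaching the parser on the twill
path but not on the plain mechanize path.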
On 8/9/06, I wrote:
> OK, the problem was an outdated version of mechanize packaged with
> twill. I tested both the mechanize from its own subversion repository
> and the one included in the latest (darcs) version of twill, and only
> the latter worked.
>
> Titus, do you plan to update the components used by twill (in
> particular, the mechanize library, so that this problem is fixed)?
> Thanks.
>
> On 8/8/06, John J Lee (jjl at pobox.com) wrote:
> > On Tue, 8 Aug 2006, twill.overbored at spamgourmet.com wrote:
> >
> > > Would you mind sharing the code snippet to make this happen? What
> > > version of mechanize did you use? (The one included with twill?)
> > > Thanks.
> > [...]
> >
> > I didn't do anything special, and used mechanize SVN with Python 2.5.
> >
> > I won't post/email the ten line script: I don't want to encourage scraping
> > against terms of use (it seems the no-scrape clause is rather a legal
> > reflex action these days). Simply logging in doesn't appear to be against
> > their terms, but I assume in your case logging in would be a prelude to
> > some automated action, which they do prohibit.
> >
> >
> > John
> >
> >
>