[twill] Loading cookies (was Re: cookie client)
Dolf Andringa
dolf.andringa at elcyion.nl
Fri Jan 26 01:55:30 PST 2007
Hi everybody,
This question has been dead for a while, so I guess it has been solved. I
just found the solution myself for Scopus, so for archive purposes,
here it is:
Scopus protects itself against crawlers by checking the User-Agent header.
If it is the urllib default (Python-urllib/x.y), it redirects you to a
crawlerprotection.url page. Just set the User-Agent header to Mozilla/5.0
and the problem is solved:
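import cookielib, urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
# Replace the default Python-urllib User-Agent with a browser-like one
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
f = opener.open(url)  # url is the Scopus page you want to fetch
print f.read()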
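For anyone reading this on Python 3, where cookielib and urllib2 became http.cookiejar and urllib.request, a rough equivalent might look like the sketch below. The helper name make_opener is made up for illustration, and I have not checked it against Scopus itself.

```python
import http.cookiejar
import urllib.request


def make_opener(user_agent="Mozilla/5.0"):
    """Return (opener, cookie_jar) with a browser-like User-Agent header.

    make_opener is a hypothetical helper name, not part of any library.
    """
    cj = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
    # Replace the default "Python-urllib/x.y" agent string.
    opener.addheaders = [("User-agent", user_agent)]
    return opener, cj


# Usage (needs network access, so it is left commented out):
# opener, cj = make_opener()
# with opener.open(url) as f:
#     print(f.read().decode("utf-8", errors="replace"))
```

The cookie jar is returned alongside the opener so that any cookies the site sets can be inspected afterwards.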
Dolf.
irina nudelman wrote:
> Hi John,
>
> Thank you for your suggestions! I didn't have a chance to try them today
> at work. I will try them on Mon. I'm on Windows and I'm using the
> Mozilla Firefox browser, but I could use the explorer as well.
>
> Thank you!
> Irina.
>
>
> John J Lee <jjl at pobox.com> wrote:
>
> On Thu, 3 Aug 2006, John J Lee wrote:
> [...]
> > 2. Tell twill where to find your "cookie jar" with the "load_cookies"
> > command.
> [...]
>
> If you don't have the original password, this is what you have to do, of
> course.
>
> Unfortunately, it looks like twill doesn't support loading Firefox or
> Internet Explorer cookies yet (it can load and save cookies, but only in a
> special format not used by browsers). ClientCookie and mechanize can load
> Firefox and IE cookies, however. I'll try and point you in the right
> direction, but first: Are you on Windows? Which browser are you using?
>
>
> Titus -- why not have twill's load_cookies load the IE cookies if you're
> running on Windows and there's no cookie jar argument (by instantiating an
> MSIECookieJar and using .load_from_registry())? That's the preferred way
> to load IE cookies (always assuming that MSIECookieJar still works -- it
> parses an undocumented MS file format...). If there's an argument, I
> think twill should attempt to .load() using each class in turn --
> LoadError tells you it's the wrong format for that class (MSIECookieJar,
> MozillaCookieJar, LWPCookieJar -- order shouldn't matter). If you add
> support for Firefox cookie saving, you should also add the warnings about
> this from the mechanize docs (IIRC, that a running Firefox may clobber
> your changes, and you should back up any cookie files that contain
> important cookies). I think save_cookies without an argument should use
> the filename and file format that were used on the previous load_cookies.
> Finally, note MSIECookieJar does not support saving, and iteration won't
> cause loading of cookies unless .delayload is true. The delayload thing
> can be significant for MSIECookieJar because each cookie is in its own
> little file (or something similar, I forget), so I suggest having
> delayload=False until somebody asks for show_cookies, at which point you
> should call .load_all_cookies() before iterating.
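The "try .load() with each class in turn" idea above can be sketched roughly as follows. This is not twill's or mechanize's code: it uses only the two file-based jars in the Python 3 standard library (http.cookiejar has no MSIECookieJar), and load_cookie_file is a made-up name. Note too that MozillaCookieJar reads the Netscape cookies.txt format.

```python
import http.cookiejar

# Candidate jar classes; per the suggestion above, order should not matter.
JAR_CLASSES = (http.cookiejar.MozillaCookieJar, http.cookiejar.LWPCookieJar)


def load_cookie_file(filename):
    """Return the first jar whose .load() accepts the file.

    load_cookie_file is a hypothetical helper, not a twill command.
    """
    for jar_class in JAR_CLASSES:
        jar = jar_class()
        try:
            jar.load(filename, ignore_discard=True, ignore_expires=True)
        except http.cookiejar.LoadError:
            continue  # wrong format for this class, try the next one
        return jar
    raise http.cookiejar.LoadError(
        "unrecognised cookie file format: %r" % filename
    )
```

Because the returned object's class records which format matched, a later save could reuse the same jar and filename, in line with the save_cookies suggestion.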
>
>
> John
>
>
> _______________________________________________
> wwwsearch-general mailing list
> wwwsearch-general at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/wwwsearch-general