[twill] Configuration for check_links extension

Titus Brown titus at caltech.edu
Sun Jul 9 09:57:30 PDT 2006


On Fri, Jul 07, 2006 at 10:17:29AM -0400, Hancock, David (DHANCOCK) wrote:
-> We're using the check_links extension, but on some of the pages we
-> monitor with it, we have hundreds of links that point to servlets on our
-> own site, sometimes the same servlet over and over with a different GET
-> parameter. This unnecessarily loads our server.
-> 
-> What we'd REALLY like to do is only check offsite links, so my questions
-> are two-fold:
-> 
-> 1. What's the right way to add a configuration parameter? I tried
-> whacking on a local copy of the extension, copying the style of the
-> single configuration parameter that is in there. Didn't work, and I
-> didn't want to break it further.
-> 
-> 2. What's the best way to ignore "local" links? I tried to examine the
-> url variable, looking for 'https:' before it got processed. Our local
-> servlets show up in anchor tags without the protocol (<a
-> href="/ADC/AirportInfo?ap=KBWI">, for example). I couldn't get this to
-> work.
-> 
-> If anyone can point me in the right direction, I'd be grateful.

Hi, David,

I'm not sure where the problem lies with approach #1, although I bet
it's a Python import problem (extend_with is a simple wrapper around
'import').  Perhaps you could try renaming your local copy of the module
to something else...

Regardless, it sounds like this is a good candidate for some work on my
part.  Is your main goal just #2, or were there additional goals?

I think #2 didn't work because check_links only ever sees the full URL;
whether or not the original href was relative doesn't come through at all.
So you would actually have to specify which domain(s) to *avoid*, which
should be easy to add.
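
To make that concrete, here is a rough sketch of the idea.  It is not the
actual check_links code: the function names, the example base URL and the
avoid list are all made up for illustration, and it only uses the standard
library.  It shows why a test for 'https:' on the href never sees a relative
link as relative, and how filtering on the host name could work instead.

```python
from urllib.parse import urljoin, urlparse


def resolve_links(base_url, hrefs):
    """Turn each href into an absolute URL, the form a link checker works with."""
    return [urljoin(base_url, href) for href in hrefs]


def skip_avoided(urls, avoid_domains):
    """Drop every URL whose host name is in the (hypothetical) avoid list."""
    avoid = {domain.lower() for domain in avoid_domains}
    return [
        url for url in urls
        if (urlparse(url).hostname or "").lower() not in avoid
    ]


# Hypothetical page and links; the relative href is the one from the question.
base = "https://www.example.com/ADC/index.html"
hrefs = ["/ADC/AirportInfo?ap=KBWI", "http://offsite.example.org/page"]

absolute = resolve_links(base, hrefs)
to_check = skip_avoided(absolute, ["www.example.com"])
```

After resolution the servlet link carries the page's own scheme and host, so
it looks just like any other absolute URL; only the host comparison sets it
apart from the offsite link.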

cheers,
--titus


