[linux-elitists] web server software for tarpitting?

James Sparenberg james@linuxrebel.org
Fri Feb 15 19:24:42 PST 2008


On Thursday 14 February 2008 19:27:48 Gerald Oskoboiny wrote:
> * James Sparenberg <james@linuxrebel.org> [2008-02-13 17:29-0800]
>
> > On Tuesday 12 February 2008 10:33:17 Gerald Oskoboiny wrote:
> > > Yes, but a single IP address re-fetching the same URL thousands
> > > or hundreds of thousands of times a day seems excessive.
> >
> > Perhaps a mirroring system is needed?  Also a change to the standard
> > that says straight up that the DTD needs to be brought down local and
> > cached.  What seems to be missing, if I understand the problem space
> > correctly, is a means whereby high-volume users can cache a local copy,
> > yet have a reasonable assurance that if the master is updated (by
> > w3c.org in this case), this update is propagated in a manner not
> > dissimilar to, oh, a DNS update.  People could grab and hold a copy of
> > the master, with a TTL of say 1 week, dramatically lowering your overhead
> > (or anyone else's) while at the same time creating the ability to have
> > master control without fragmentation.
>
> HTTP has all this stuff built in already, it's just being
> ignored. These resources generally haven't changed in 5+ years,
> are unlikely to change in the future, and are served with
> explicit expiry times of 90 days to a year. There's no reason
> anyone should need to fetch them even once a week, let alone
> thousands of times a day.
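For what it's worth, the freshness rules you're describing are mechanical enough that a client barely needs any code to honor them.  A minimal sketch in Python (simplified from the RFC 2616 rules; the dict-of-headers interface is just for illustration):

```python
import email.utils

def freshness_lifetime(headers):
    """Seconds a cached response stays fresh, per RFC 2616 (simplified):
    Cache-Control: max-age wins; otherwise Expires minus Date."""
    for part in headers.get("Cache-Control", "").split(","):
        part = part.strip()
        if part.startswith("max-age="):
            return int(part.split("=", 1)[1])
    expires = headers.get("Expires")
    date = headers.get("Date")
    if expires and date:
        exp = email.utils.mktime_tz(email.utils.parsedate_tz(expires))
        dat = email.utils.mktime_tz(email.utils.parsedate_tz(date))
        return max(0, exp - dat)
    return 0  # no explicit lifetime; caches fall back to heuristics

# A 90-day lifetime, like the explicit expiry times you mention:
print(freshness_lifetime({"Cache-Control": "max-age=7776000"}))  # 7776000
```

Any client that ran its fetches through a check like this would hit the origin at most a few times a year per resource.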

The once a week was an example, not an absolute.  As for DNS caching, or 
things like it: after some recent problems in that space (not on my systems, 
but on a customer's data backend we access), I'm not much of a fan.  It's way 
too easy to poison the well.  Again I'll agree, diligence builds reliability, 
but IMHO as long as development is along the lines of "easy to create == let 
me do whatever I want" and EULAs say "not my fault if I screwed up," we won't 
get the diligence.

IMHO having those facilities available in HTTP does little good if they're 
not referenced right in your face.  In fact, consider that many of the XML 
how-to books seem to indicate that pulling down a DTD to cache locally is 
either a copyright violation, theft of IP, or impossible to make work, 
because it has to be remote.
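Those books are wrong on the last point, at least: every mainstream XML parser lets you intercept the DTD fetch and hand it a local copy.  A sketch using Python's stdlib SAX parser (the URL and cache path below are made-up stand-ins):

```python
import xml.sax
import xml.sax.handler

# Map of remote system IDs to locally cached copies.  Both the URL and
# the cache path here are hypothetical.
LOCAL_DTDS = {
    "http://example.org/dtds/memo.dtd": "/var/cache/dtd/memo.dtd",
}

class LocalEntityResolver(xml.sax.handler.EntityResolver):
    """Hand the parser a cached file instead of letting it hit the network."""
    def resolveEntity(self, publicId, systemId):
        # Fall through to the original system ID if we have no local copy.
        return LOCAL_DTDS.get(systemId, systemId)

parser = xml.sax.make_parser()
# Ask the parser to read external entities (the DTD) at all...
parser.setFeature(xml.sax.handler.feature_external_ges, True)
# ...but route every fetch through our resolver first.
parser.setEntityResolver(LocalEntityResolver())
```

Same idea exists under other names elsewhere: XML catalogs, libxml2's catalog support, Java's EntityResolver.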

I really am wondering whether the problem you are trying to solve can 
actually be fixed on your servers.  In my experience, people will spend more 
time working around such blocks than they would have spent just doing it 
right the first time.

On the other hand, thanks to your thread in this forum, I'm doing a traffic 
audit on our use of DTDs at my company (we do VXML) to make sure we aren't 
hammering anyone if we can avoid it.  (I'm told there are some Sun DTDs for 
JavaXML that have to come from Sun servers ... I'll have to see.)
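If anyone else wants to run the same audit, grepping your documents for DOCTYPEs that point at remote servers is a decent first pass.  A sketch (the regex is deliberately simplistic, and the sample document is made up):

```python
import re

# Find DOCTYPE declarations whose DTD lives on a remote server.
REMOTE_DTD = re.compile(
    r'<!DOCTYPE\s+\S+\s+(?:SYSTEM|PUBLIC\s+"[^"]*")\s+"(https?://[^"]+)"')

def remote_dtds(text):
    """Return every remote DTD URL referenced in an XML document."""
    return REMOTE_DTD.findall(text)

sample = '<!DOCTYPE memo SYSTEM "http://example.org/dtds/memo.dtd"><memo/>'
print(remote_dtds(sample))  # ['http://example.org/dtds/memo.dtd']
```

Run it over your document tree and cross-check the hits against your proxy or firewall logs to see which ones are actually fetched at runtime.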

James
