[linux-elitists] web server software for tarpitting?

James Sparenberg james@linuxrebel.org
Wed Feb 13 17:29:42 PST 2008


On Tuesday 12 February 2008 10:33:17 Gerald Oskoboiny wrote:
> * Evan Prodromou <evan@prodromou.name> [2008-02-12 12:37-0500]
>
> >On Sun, 2008-02-10 at 23:06 -0800, Gerald Oskoboiny wrote:
> >> The other day we posted an article [1] about excessive traffic
> >> for DTD files on www.w3.org: up to 130 million requests/day, with
> >> some IP addresses re-requesting the same files thousands of times
> >> per day. (up to 300k times/day, rarely)
> >>
> >> The article goes into more details for those interested, but the
> >> solution I'm thinking will work best (suggested by Don Marti
> >> among others) is to tarpit the offenders.
> >
> >...and not punish everybody else, right?
>
> Right, just punish those who are abusive.
>
> >>      W3C's current traffic is something like:
> >>
> >>        - 66% DTD/schema files (.dtd/ent/mod/xsd)
> >>        - 25% valid HTML/CSS/WAI icons
> >>        - 9% other
> >
> >It sounds like W3C has been having a problem satisfying its promises,
> >then. When you publicize an URL, like a DTD or schema, you're giving
> >some tacit permission to use that URL.
>
> Yes, but a single IP address re-fetching the same URL thousands
> or hundreds of thousands of times a day seems excessive.

Perhaps a mirroring system is needed?  Also a change to the standard that
says straight up that the DTD needs to be brought down locally and cached.
What seems to be missing, if I understand the problem space correctly, is a
means whereby high-volume users can cache a local copy, yet have a reasonable
assurance that if the master is updated (by w3c.org in this case), the
update is propagated in a manner not dissimilar to, oh, a DNS update.  People
could grab and hold a copy of the master, with a TTL of say 1 week,
dramatically lowering your overhead (or anyone else's) while at the same time
creating the ability to have master control without fragmentation.
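
To make the TTL idea concrete, here is a rough sketch in Python of what a
client-side cache along those lines might look like.  This is only an
illustration under my own assumptions: the names TTLCache and fetch_over_http
are made up for the example, the one-week default just mirrors the figure
above, and nothing here reflects how w3.org or any existing XML library
actually behaves.

```python
import time
import urllib.request


def fetch_over_http(url):
    """Hypothetical default fetcher: download the document body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


class TTLCache:
    """Hold a local copy of each URL and refetch only after the TTL expires.

    Illustrative sketch only; the class name and interface are assumptions.
    """

    def __init__(self, fetch=fetch_over_http, ttl_seconds=7 * 24 * 3600,
                 clock=time.time):
        self._fetch = fetch        # callable taking a URL, returning bytes
        self._ttl = ttl_seconds    # how long a cached copy stays fresh
        self._clock = clock        # injectable so the TTL logic can be tested
        self._store = {}           # url -> (time fetched, body)

    def get(self, url):
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]        # still fresh: no request to the master
        body = self._fetch(url)    # missing or expired: go back to the master
        self._store[url] = (now, body)
        return body
```

With something like this in front of a parser, a client that asks for the
same DTD thousands of times a day would hit the master roughly once a week
per file.  A real deployment would probably also want to revalidate with the
master (a conditional request, say) instead of blindly refetching, and keep
the copy on disk so it survives restarts.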

James


