[linux-elitists] web server software for tarpitting?
Wed Feb 13 17:29:42 PST 2008
On Tuesday 12 February 2008 10:33:17 Gerald Oskoboiny wrote:
> * Evan Prodromou <email@example.com> [2008-02-12 12:37-0500]
> >On Sun, 2008-02-10 at 23:06 -0800, Gerald Oskoboiny wrote:
> >> The other day we posted an article about excessive traffic
> >> for DTD files on www.w3.org: up to 130 million requests/day, with
> >> some IP addresses re-requesting the same files thousands of times
> >> per day. (up to 300k times/day, rarely)
> >> The article goes into more details for those interested, but the
> >> solution I'm thinking will work best (suggested by Don Marti
> >> among others) is to tarpit the offenders.
> >...and not punish everybody else, right?
> Right, just punish those who are abusive.
> >> W3C's current traffic is something like:
> >> - 66% DTD/schema files (.dtd/ent/mod/xsd)
> >> - 25% valid HTML/CSS/WAI icons
> >> - 9% other
> >It sounds like W3C has been having a problem satisfying its promises,
> >then. When you publicize an URL, like a DTD or schema, you're giving
> >some tacit permission to use that URL.
> Yes, but a single IP address re-fetching the same URL thousands
> or hundreds of thousands of times a day seems excessive.
Perhaps a mirroring system is needed? Also a change to the standard that
says straight up that the DTD needs to be brought down locally and cached. What
seems to be missing, if I understand the problem space correctly, is a means
whereby high-volume users can cache a local copy yet have reasonable
assurance that if the master is updated (by W3C in this case), the
update is propagated in a manner not dissimilar to, oh, a DNS update. People
could grab and hold a copy of the master, with a TTL of say 1 week,
dramatically lowering your overhead (or anyone else's) while at the same time
creating the ability to have master control without fragmentation.
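
For what it's worth, here is a rough sketch of the client side of that idea in Python: hold a local copy and only go back to the master once the TTL has run out. The class name TTLCache, the ONE_WEEK constant, and the injected fetch and clock callables are all made up for illustration; nothing here is an existing W3C mechanism, and it leaves out on-disk storage and any push-style update notification.

```python
import time
from typing import Callable, Dict, Tuple

# TTL in seconds, matching the "say 1 week" figure suggested above.
ONE_WEEK = 7 * 24 * 60 * 60


class TTLCache:
    """Hold a local copy of each document; refetch only after the TTL expires."""

    def __init__(
        self,
        fetch: Callable[[str], bytes],
        ttl: float = ONE_WEEK,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch    # hypothetical: whatever retrieves the master copy
        self._ttl = ttl
        self._clock = clock    # injectable so expiry can be exercised without waiting
        self._store: Dict[str, Tuple[float, bytes]] = {}

    def get(self, url: str) -> bytes:
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            # Still fresh: serve the local copy, no request to the master.
            return entry[1]
        # Missing or expired: go back to the master once and remember the result.
        body = self._fetch(url)
        self._store[url] = (now, body)
        return body
```

In real use the fetch callable could be something like lambda url: urllib.request.urlopen(url).read(). A client that asks for the same DTD thousands of times a day would then hit the master roughly once a week per URL instead.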