[linux-elitists] web server software for tarpitting?
Tue Feb 12 11:07:26 PST 2008
Evan Prodromou wrote:
> On Sun, 2008-02-10 at 23:06 -0800, Gerald Oskoboiny wrote:
>> W3C's current traffic is something like:
>> - 66% DTD/schema files (.dtd/ent/mod/xsd)
>> - 25% valid HTML/CSS/WAI icons
>> - 9% other
> It sounds like W3C has been having a problem satisfying its promises,
> then. When you publicize an URL, like a DTD or schema, you're giving
> some tacit permission to use that URL. Why are you now trying to
> penalize those people who actually bought the story and are using the
"Using" is the term in question here. We're not talking about
constructive access to these URLs, but about poorly implemented software
that polls W3C when it should not. Well-behaved software does not do
this. Why should W3C foot the bill for poorly behaved software?
Degrading W3C access for poorly written software is a fine way to deal
with this. Such software can be trivially fixed to avoid the
degradation.
> It seems to me the way to solve your problem is to:
> 1. Clarify and publicize best practises for using W3C resources
> into a server use policy. How often is it OK to hit a W3C-hosted
> DTD? Once a day? Once an hour? Once a minute?
Typically, it should be almost never. There's no reason for an app that
actually needs the DTDs not to have a copy of them handy for local
access. Checking to see if the DTD has changed once a week or month
might be reasonable, using an HTTP HEAD request. Beyond that, there's
simply nothing in the DTDs that needs to ever change on a day-by-day
basis. If, by some oddity, a security problem should involve the way a
DTD is handled, the app should be the one pushing out the fix, not the
W3C. Anything else can be fixed on a much longer-term basis.
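
To make the "local copy plus an occasional HEAD" idea concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: the one-week interval, the cache file layout, and the names get_dtd, http_head and http_get are all made up for this example, and it leans on the Last-Modified header, which a given server may or may not send.

```python
import hashlib
import os
import time
import urllib.request

CHECK_INTERVAL = 7 * 24 * 3600  # assumption: revalidate at most once a week


def _paths(url, cache_dir):
    # Hypothetical layout: files named after a hash of the URL.
    base = os.path.join(cache_dir, hashlib.sha1(url.encode("utf-8")).hexdigest())
    return base + ".dtd", base + ".meta"


def _write_meta(meta_path, checked_at, last_mod):
    # First line: time of the last check. Second line: Last-Modified value.
    with open(meta_path, "w") as f:
        f.write("%s\n%s" % (checked_at, last_mod))


def http_head(url):
    """Return the Last-Modified header via HEAD (no body is transferred)."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.headers.get("Last-Modified")


def http_get(url):
    """Return (body, Last-Modified) via an ordinary GET."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read(), resp.headers.get("Last-Modified")


def get_dtd(url, cache_dir, now=None, head=http_head, get=http_get):
    """Serve the DTD from disk; touch the network only when a check is due."""
    now = time.time() if now is None else now
    body_path, meta_path = _paths(url, cache_dir)

    if os.path.exists(body_path) and os.path.exists(meta_path):
        with open(meta_path) as f:
            checked_at, last_mod = f.read().split("\n", 1)
        with open(body_path, "rb") as f:
            cached = f.read()
        if now - float(checked_at) < CHECK_INTERVAL:
            return cached                  # checked recently: no traffic at all
        current = head(url)                # check is due: headers only
        if current is not None and current == last_mod:
            _write_meta(meta_path, now, last_mod)
            return cached                  # unchanged: keep the local copy

    # No local copy yet, or the remote copy changed: download once.
    body, last_mod = get(url)
    with open(body_path, "wb") as f:
        f.write(body)
    _write_meta(meta_path, now, last_mod or "")
    return body
```

An app would call get_dtd(url, cache_dir) wherever it currently fetches the DTD directly. Shipping the DTDs with the app, or using a local XML catalog, would avoid even the weekly HEAD.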
> 2. For absolutely terrible bad-behavers, block them by IP number --
> or return a brief-as-possible HTTP 403 response with a link to
> your server use policy. It sounds like a quick way to cut down
> on your traffic and save some headaches.
Tar-pitting hurts end-users less. Having my app take an extra 30 seconds
to do something is far less damaging in many cases than having it fall
over (which is what many such apps will do if they can't access these
DTDs, which is why we call them poorly written apps). I'm thinking of
things like reservation systems here, where a hotel might be hurt by
having their reservation GUI slow down, but they'd be crippled by having
it simply stop.
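
For what it's worth, the tarpit idea can be sketched in a few lines of Python. This is a toy under loud assumptions, not anything W3C actually runs: the window, the free-request allowance, the one-second-per-excess-request rule, and the names HitCounter, tarpit_delay and serve are all invented here. Only the 30-second cap echoes the figure above.

```python
import time
from collections import defaultdict, deque
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

WINDOW = 60.0     # seconds of history kept per client (hypothetical)
FREE_HITS = 10    # requests per window served at full speed (hypothetical)
MAX_DELAY = 30.0  # cap on the added delay, in seconds


class HitCounter:
    """Sliding-window request counter keyed by client address."""

    def __init__(self, window=WINDOW):
        self.window = window
        self.hits = defaultdict(deque)

    def record(self, client, now):
        q = self.hits[client]
        q.append(now)
        # Forget requests that have fallen out of the window.
        while q and now - q[0] > self.window:
            q.popleft()
        return len(q)


def tarpit_delay(hit_count, free_hits=FREE_HITS, max_delay=MAX_DELAY):
    """One extra second per request beyond the allowance, up to the cap."""
    excess = hit_count - free_hits
    if excess <= 0:
        return 0.0
    return min(max_delay, float(excess))


counter = HitCounter()


class TarpitHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        n = counter.record(self.client_address[0], time.monotonic())
        time.sleep(tarpit_delay(n))  # slow the client down; never refuse it
        body = b"<!-- DTD content would go here -->\n"
        self.send_response(200)
        self.send_header("Content-Type", "application/xml-dtd")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="127.0.0.1", port=8080):
    # Blocks forever; call it from a script's entry point.
    ThreadingHTTPServer((host, port), TarpitHandler).serve_forever()
```

A polite client never notices, while one that hammers the server gets progressively slower answers but still gets answers. Note that sleeping inside a handler ties up a thread per stalled request, so a server doing this at real scale would want a cheaper way to hold connections open.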
> 3. Build a content-distribution network (CDN) to free up your
> servers for the important stuff. You could either pony up the
> cash for a commercial CDN, or you could use W3C's goodwill in
> the Web community to put together a free and informal system of
It's easy to arm-wave and demand that a faceless organization spend
money on your behalf. In truth, we're talking about the W3C, an
organization that is at the heart of defining one of the most important
tools of human communication of the previous millennium, and almost
certainly of the current one. Addressing the issues of economic and
infrastructural scaling early is, literally, in all of humanity's best
interest. That's not something you get to say without hyperbole often.