[linux-elitists] web server software for tarpitting?

kermit tensmeyer klt_7940@fastmail.fm
Sat Feb 23 19:06:30 PST 2008


On Sat, 23 Feb 2008 08:59:02 -0800, Don Marti wrote:

> begin kermit tensmeyer quotation of Thu, Feb 21, 2008 at 12:05:08AM
> +0000:
> 

>>    No it can't. Before "trivially fixing the software", it should be a
>> requirement to fix the standards as defined by W3C. If one defines the
>> DTD for XHTML as authoritative in one (and only one) URI, then there is
>> only one logical location that can provide authoritative answers.
> 
> The http:// URL implies the HTTP standard.  If the HTTP response
> includes "Expires:" then the client is allowed to rely on that response
> until the "Expires:" date.

 Does the HTTP response for the DTD GET on w3.org include an Expires 
header? It didn't the last time I looked, but the last time I looked was 
a while ago.
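
 If anyone wants to re-check, a quick sketch along these lines (plain 
java.net, pointed at the usual XHTML 1.0 Strict DTD URI; swap in whichever 
DTD or Schema you care about) will print whatever caching headers the 
server currently sends back:

import java.net.HttpURLConnection;
import java.net.URL;

public class CheckDtdHeaders {
    public static void main(String[] args) throws Exception {
        // The usual XHTML 1.0 Strict DTD URI; adjust for whichever DTD/Schema you care about.
        URL dtd = new URL("http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd");
        HttpURLConnection conn = (HttpURLConnection) dtd.openConnection();
        conn.setRequestMethod("HEAD");   // headers only, no need to pull the body
        System.out.println("Status:        " + conn.getResponseCode());
        System.out.println("Expires:       " + conn.getHeaderField("Expires"));
        System.out.println("Cache-Control: " + conn.getHeaderField("Cache-Control"));
        System.out.println("Last-Modified: " + conn.getHeaderField("Last-Modified"));
        conn.disconnect();
    }
}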

> 
>>   the solutions are to fix the standards, and then focus on allowing
>> software to match the standards. DTD validation has been around for a
>> long time. Quick fixes will generate other problems that can be worse
>> than the problem they were intended to fix
> 
> Already fixed at the HTTP level.  Caching the DTD until it expires is
> the only approach that's consistent with both the HTTP standard and the
> unwritten HTTP client rule of not being a dick to the people who run the
> HTTP server.


  Funny thing.  The people at the W3C are (collectively) responsible for 
the standard. The issue of caching isn't part of the validation software, 
but is likely an issue with whichever service or library fetches remote 
components. [[ LibXML2, which validates XML, uses a set of glib routines 
to fetch remote components. Software which calls components that use 
LibXML2 doesn't have a clue as to which function is used to fetch remote 
pages, or how some other hidden configuration catalog points to some URI 
locator which may be local or remote. ]] [[ Saxon and Xalan use Xerces or 
other tools to do the job of handling URIs. ]]

 Quick fixes that cause problems when generalized include proxy caching 
requirements and catalog resource servers. In environments where 
DTD/Schema validation is anticipated and expected, some adjustments can 
be made (one such adjustment is sketched below).

Where the validation is incidental to the primary focus, it's harder to 
compensate.
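
 For the anticipated case, the usual adjustment is to keep a local copy 
of the DTD and resolve to it before the parser ever goes to the network. 
A minimal JAXP/SAX sketch, assuming a validating parse; the public 
identifier is the standard XHTML 1.0 Strict one, but the local file path 
is made up, so point it at wherever you actually keep the DTD:

import java.io.FileReader;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class LocalDtdResolver implements EntityResolver {

    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        // Serve the XHTML 1.0 Strict DTD from a local copy instead of hitting www.w3.org.
        // The path below is an example only; point it at wherever the DTD actually lives.
        if ("-//W3C//DTD XHTML 1.0 Strict//EN".equals(publicId)) {
            return new InputSource(new FileReader("/usr/local/share/dtd/xhtml1-strict.dtd"));
        }
        return null;  // unknown entity: fall back to the default (remote) resolution
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver(new LocalDtdResolver());
        builder.parse(args[0]);  // any XHTML file: the DTD now comes off the local disk
    }
}

 That only helps when you own the code that creates the parser, which is 
exactly the problem described above.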

 This recommendation for tarpitting domains based on repeated DTD fetches 
will cause problems, because most of the people who see the degraded 
performance won't have any connection to the issues causing the problem.

 If you can limit the impact to the speed with which the affected DTDs or 
schemas are served by the w3c server, then the degraded user performance 
can be fixed by the groups who notice the problem and can fix it.

>>    Having some corporate Email Server run slower because the choice of
>> email provider software doesn't play nice may be reasonable if and
>> only if there is an alternative that doesn't incur the penalty
 
> Already-deployed corporate software gets fixed only if someone with
> budget authority complains, or if there's a credible legal threat.
> 
> Remember the great Wisconsin NTP DoS of 2003?
>   http://pages.cs.wisc.edu/~plonka/netgear-sntp/
>   http://pages.cs.wisc.edu/~plonka/netgear-sntp/#ToC44

   Why no, I don't remember. Did UofW have a problem in 2003? Not even a 
blip on the radar.
   
   What's your point...
   
    Do you remember the Internet Worm that shut down seismo, ames and 
ucbvax?
    Do you have any idea what seismo or ames or ucbvax were?

> 
> This situation seems different because it's not one badly designed
> product, but zillions of in-house programs.  In this case, the only way
> I can see to make someone complain is to slow down the software. The
> developer who's ordered to make the fix wouldn't even have to know about
> the policy.

  It isn't a few million in-house programs that are the problem, it's the 
majority of the software available on the Internet. Almost everything that 
has anything to do with XML, SGML or HTML is impacted by these changes.
  
  It's not a choice the programmer or developer gets to make; it really is 
at the end of a chain of library calls, and it may not be anything that 
the programmer of the in-house software can change.



> 
> It seems like the most obvious fix would be for HTTP client libraries to
> cache where possible by default, and make the programmer turn off the
> cache if he or she really didn't want it.

OK... how do I turn on HTTP caching for the validation software in the 
SOAP processor, and is that different from the HTTP caching used by the 
catalog service invoked by the logging function of the Java web 
container? There must be somewhere all those options are documented 
... :-(
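
 To make the point concrete: even in the best case, where you can get at 
the parser and the transformer at all, every layer has its own hook. 
Assuming the xml-commons resolver jar is on the classpath and that 
"xml.catalog.files" is the right property name (both of which I'm taking 
on faith here), wiring a catalog into just two of those layers already 
looks something like the sketch below, and the SOAP stack, the logging 
framework and the container each repeat the exercise with their own knobs:

import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.apache.xml.resolver.tools.CatalogResolver;  // xml-commons resolver.jar
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

public class CatalogWiring {
    public static void main(String[] args) throws Exception {
        // Assumption: the xml-commons resolver reads its catalog list from this
        // system property. The catalog path is an example only.
        System.setProperty("xml.catalog.files", "/etc/xml/catalog");
        CatalogResolver resolver = new CatalogResolver();

        // Knob #1: the SAX parser resolves DTDs through the catalog...
        XMLReader reader = XMLReaderFactory.createXMLReader();
        reader.setEntityResolver(resolver);
        reader.parse(args[1]);  // the DOCTYPE in the input now resolves locally

        // Knob #2: ...but the XSLT engine (Xalan, Saxon, whatever TransformerFactory
        // picks) has its own, separate URIResolver that must be wired up independently.
        TransformerFactory tf = TransformerFactory.newInstance();
        tf.setURIResolver(resolver);
        Transformer t = tf.newTransformer(new StreamSource(args[0]));  // args[0] = stylesheet
        t.setURIResolver(resolver);
        t.transform(new StreamSource(args[1]), new StreamResult(System.out));
    }
}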



