[linux-elitists] (tmda) Re: Constraining Bogus challenges.

Matt Beland matt@rearviewmirror.org
Tue Sep 23 12:23:35 PDT 2003


On Tuesday 23 September 2003 11:17 am, Larry M. Augustin wrote:
> This is a great example of lying with statistics.  I've done an extensive
> cross-product survey of content/context filtering, and on average the
> numbers are nowhere near that.
> I don't doubt that there exist people on this mailing list who have
> carefully tuned setups fitting their individual tastes that are able to
> achieve those rates.  However, for the non-technical user the tweaking
> necessary to achieve that level of accuracy is not an option.  For the
> typical user of anti-spam systems based on content/context filtering,
> accuracy is more like 75%.

I call "bullshit".

I run a mail server for about 300 users, almost all of whom are non-technical, 
some to the extreme that they don't even know how email works beyond "click 
this to write a new message, and this to reply to one you received". On that 
mail server, I run a plain-vanilla SpamAssassin 2.55 (upgraded half an hour 
ago to 2.60) installation. No special configurations. "Spam" score set to 
8.0. No network tests, either; only the internal rules plus the built-in Bayes 
auto-learning. Observed "failure" rate is about 0.5%, starting at damned near 
0% and gradually increasing as spammers adjust to the new rulesets. Of 
course, it takes time for content filtering to become effective; with 
SpamAssassin, I believe it takes 3000 messages before the Bayes filters 
really kick in and start affecting the scores. But there's no configuration 
necessary for that; everything auto-learns until that threshold is reached. 
You *can* tweak it, you *can* run additional saved messages to train it 
faster. But if you don't, it won't make it any less effective - it just takes 
longer for the "dictionary" to be built up. You, of course, couldn't have 
been bothered to learn about that or take it into account during your 
"extensive cross-product survey".
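
To make the setup above concrete, here is a rough sketch of how a score threshold and a delayed Bayes contribution fit together. It is a toy model, not SpamAssassin's code or API: the function names are hypothetical, the 8.0 threshold is the one quoted above, and the 3000-message figure is only the estimate given above, not a verified default.

```python
# Toy model only; none of these names come from SpamAssassin itself.
SPAM_THRESHOLD = 8.0        # the "Spam" score used in the setup described above
BAYES_MIN_MESSAGES = 3000   # assumption: the estimate quoted above, unverified


def total_score(rule_scores, bayes_score, messages_learned):
    """Sum the static rule scores; add the Bayes score only once enough
    messages have been auto-learned."""
    score = sum(rule_scores)
    if messages_learned >= BAYES_MIN_MESSAGES:
        score += bayes_score
    return score


def is_spam(rule_scores, bayes_score, messages_learned):
    """A message is tagged as spam when its total reaches the threshold."""
    return total_score(rule_scores, bayes_score, messages_learned) >= SPAM_THRESHOLD
```

In this sketch a message whose static rules add up to 5.5 stays under the threshold on a fresh install, and is only tagged once the learned Bayes score starts counting. Nobody has to configure anything for that to happen; it just takes time.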

On the other hand, I've gotten several complaints from these users about C-R 
"verification" spams. Many of them don't understand them; they complain 
because they think it means "their email" is broken, and several have been 
reported to me as spam. (I tend to agree, and with the crap I've heard from 
the C-R admins I've contacted about this, I think I'm finally going to 
"carefully tune" my "setup". Anybody run an RBL listing containing only 
servers known to run C-R systems?)

So, you actually haven't done jack to evaluate content-filtering systems, 
because if you had you'd know that your statement is blatantly false. You 
just want to be a happily anti-social little prick and put the burden on 
other people to prove that they're worthy of communicating with you. If you 
whiny little pricks want to wall yourselves off from the rest of the Internet, 
go right ahead. Spend all your time trading C-R requests and congratulating 
yourselves on how smart you are. May you get all you deserve.

Just keep your shit off my servers.

-- 
Matt Beland