[linux-elitists] Notify about using the e-mail account.

Aaron Sherman ajs@ajs.com
Wed Mar 3 13:33:55 PST 2004

On Wed, Mar 03, 2004 at 10:43:51AM -0800, Karsten M. Self wrote:

> Rulesets only get you so far.  The nice thing about the Bayesian
> classifiers is that they are automatically adaptive.

Very true!

And that's why SA isn't just a ruleset. SA uses a large number of approaches,
including static text-matching rules, Bayesian scoring, DNS blacklists,
distributed checksums, etc. You note the DNSBL and checksum tests later,
but not Bayes, so I just thought you might want that extra piece.
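
To make the "more than a ruleset" point concrete, here is a rough sketch
of how several independent tests can each contribute a score that gets
summed and compared to a cutoff. The test names, the weights, and the
5.0 cutoff below are all made up for illustration; they are not SA's
real rules or scores.

```python
# Hypothetical test names and weights, for illustration only.
# These are NOT SpamAssassin's actual rules or scores.
SCORES = {
    "TEXT_RULE_HIT": 1.5,    # a static text-matching rule fired
    "BAYES_HIGH": 3.0,       # the Bayesian classifier thinks it is spam
    "DNSBL_LISTED": 2.0,     # the sending host is on a DNS blacklist
    "CHECKSUM_MATCH": 2.5,   # the body matches a distributed checksum
}

THRESHOLD = 5.0  # assumed cutoff for calling a message spam


def total_score(hits):
    """Sum the weights of every test that fired on a message."""
    return sum(SCORES[name] for name in hits)


def is_spam(hits, threshold=THRESHOLD):
    """A message is flagged once its total reaches the threshold."""
    return total_score(hits) >= threshold
```

With these invented weights, no single test is enough to flag a message,
but a Bayes hit plus a DNSBL listing together reach the cutoff.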

D-Spam is also quite nice, and I recommend it or SA. There are problems
with SA, but for the most part it does an excellent job. Please note that
viruses are a bad data-point. There are very good ways to look for
viruses, and you can even do it in hardware these days (via devices
that regularly download a signature list and look at network traffic,
disconnecting any sessions that involve viruses). SA is NOT designed
around viruses for this reason, and focuses purely on commercial
spam. Of course, that may change....

> And there is a new class of adaptive filters which promises 99.97+%
> effectiveness at spam filtering.  These scan "windows" of text rather
> than single-word naive Bayesian analysis.

Naive Bayes isn't used by ANY spam filter anymore, is it?! If yours
does, dump it fast!
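
For anyone wondering what single words versus "windows" means in
practice, here is a minimal sketch of the two ways of cutting a message
into tokens. The function names and the window size of 3 are my own
assumptions, and this is not how any particular filter does it; it only
shows that a window yields word pairs in addition to the words.

```python
def single_tokens(text):
    """Single-word tokenization: every word stands alone."""
    return text.lower().split()


def window_tokens(text, size=3):
    """Emit each word, plus that word paired with each later word
    that falls inside a sliding window of `size` words."""
    words = text.lower().split()
    tokens = []
    for i, first in enumerate(words):
        tokens.append(first)
        for j in range(i + 1, min(i + size, len(words))):
            tokens.append(first + " " + words[j])
    return tokens
```

Calling window_tokens("buy cheap pills now") gives tokens such as
"buy cheap" and "buy pills" alongside the plain words, so the classifier
gets to see some context instead of isolated words.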

> The advantage of Spamassassin has pretty much always been that it's a
> _framework_ into which you can drop arbitrary spam-detection methods.

Yes, although I think D-Spam has a better approach. It started off
as pure Bayes and has been slowly moving other tests in as tokens that
Bayes uses. I've been recommending this to the SA folks too, but they're
sticking with the GA (genetic algorithm) for now. GAs are nice, but I
don't like how slow the SA rules are to adapt (because scoring is static
per version). Granted, the fact that some of those scores represent
dynamic sources (DNSBL, Bayes, etc.) makes it worthwhile, but probably
not ideal... IMHO.
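
A rough sketch of the "tests as tokens" idea, to show why it adapts
faster than fixed scores: the result of each non-text test becomes one
more token, and its weight is learned from the same training data as
the words. This is not D-Spam's actual code. The "TEST*" prefix, the
0.4 default for unseen tokens, and the very simple combining rule are
all assumptions I picked to keep the example short.

```python
import math


def tokens_for(words, test_hits):
    """Message tokens: the words themselves plus one synthetic token
    per non-text test that fired (e.g. a DNSBL listing)."""
    return list(words) + ["TEST*" + name for name in test_hits]


def spam_probability(tokens, token_probs, default=0.4):
    """Combine per-token spam probabilities with the simple rule
    P = prod(p) / (prod(p) + prod(1 - p)), computed in log space.
    `token_probs` maps token -> learned spam probability, each
    strictly between 0 and 1; unseen tokens get `default`."""
    log_spam = sum(math.log(token_probs.get(t, default)) for t in tokens)
    log_ham = sum(math.log(1.0 - token_probs.get(t, default)) for t in tokens)
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))
```

Because the synthetic tokens sit in the same table as the words,
retraining on new mail updates their weight right away, with no need to
wait for a new release to ship rescored rules.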

