[linux-elitists] Notify about using the e-mail account.

Aaron Sherman ajs@ajs.com
Wed Mar 3 14:49:17 PST 2004


On Wed, 2004-03-03 at 17:25, Karsten M. Self wrote:

> I see the prior strategy of specifically coding in tests for specific
> words and/or phrases as less critical.  Though it's convenient to have
> Nigeria spam identified for me in SpamAssassin headers.

I made some major overhauls to the phrase tests in SA a while back and
submitted the work (along with an analysis of its impact) and it was
semi-rejected. In the end, I partially agreed with their take.

They contend that brain-dead phrase testing, for all its expense, is
essentially the fodder that the Bayesian filter grows up on. If you try
to make it too smart, you lose much of the value Bayes derives from it:
the ability to train itself faster and more accurately than a pure
Bayes approach (even with a seeded token database) ever could.
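
To make the argument above concrete, here is a minimal sketch of dumb
phrase rules feeding a Bayes learner. It is not SpamAssassin's code:
the rule list, the scores, the two thresholds, and the TinyBayes class
are all hypothetical, picked only to show the shape of the idea.

```python
import re
from collections import Counter

# Hypothetical phrase rules and scores (illustrative, not SpamAssassin's).
PHRASE_RULES = [
    (re.compile(r"nigeria", re.I), 2.5),
    (re.compile(r"million dollars", re.I), 2.0),
    (re.compile(r"urgent (business )?proposal", re.I), 1.5),
]

# Hypothetical auto-learn thresholds: only confident verdicts train Bayes.
SPAM_AUTOLEARN = 4.0
HAM_AUTOLEARN = 0.1


def rule_score(text):
    """Sum the scores of every phrase rule that matches the message."""
    return sum(score for pattern, score in PHRASE_RULES if pattern.search(text))


def tokens(text):
    """Split a message into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class TinyBayes:
    """Toy token database: counts messages per token, per class."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()

    def learn(self, text, is_spam):
        target = self.spam if is_spam else self.ham
        target.update(set(tokens(text)))


def process(message, bayes):
    """Score with the phrase rules, then auto-learn on confident results."""
    score = rule_score(message)
    if score >= SPAM_AUTOLEARN:
        bayes.learn(message, is_spam=True)
    elif score <= HAM_AUTOLEARN:
        bayes.learn(message, is_spam=False)
    # Scores in between are too uncertain to train on.
    return score
```

The point of the sketch: the phrase rules never have to be clever. They
only have to be confident often enough that Bayes picks up the
surrounding tokens ("await", "ten", and so on) that no rule names.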

I saw their point in this, though I still think my changes (still in
their Bugzilla and awaiting some future grand overhaul of phrase
testing) would not have compromised the Bayes auto-learning while
providing a major boost in phrase testing performance... but that's a
topic for another day.

-- 
Aaron Sherman <ajs@ajs.com>
Senior Systems Engineer and Toolsmith
"It's the sound of a satellite saying, 'get me down!'" -Shriekback




