[linux-elitists] Notify about using the e-mail account.
Wed Mar 3 14:49:17 PST 2004
On Wed, 2004-03-03 at 17:25, Karsten M. Self wrote:
> I see the prior strategy of specifically coding in tests for specific
> words and/or phrases as less critical. Though it's convenient to have
> Nigeria spam identified for me in SpamAssassin headers.
I made some major overhauls to the phrase tests in SA a while back and
submitted the work (along with an analysis of its impact) and it was
semi-rejected. In the end, I partially agreed with their take.
They contend that brain-dead phrase testing, for all its expense, is
essentially the fodder that the Bayesian filter grows up on. If you try
to smarten it too much, you remove a lot of the value Bayes derives from
it, which is what lets it train itself faster and more accurately than a
pure Bayes approach (even with a seeded token database) ever could.
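To make that argument concrete, here is a rough Python sketch of dumb phrase rules feeding a Bayes-style token store. It is not SpamAssassin's code: the phrases, scores, thresholds, and the TokenStore class are all invented for illustration. The only idea taken from the discussion above is that messages the phrase rules score confidently are used as training data.

```python
import re
from collections import Counter

# Hypothetical phrase rules and scores; not SpamAssassin's real rule set.
PHRASE_RULES = {
    "urgent business proposal": 2.5,
    "transfer of funds": 2.0,
    "next of kin": 1.5,
}

# Hypothetical auto-learn thresholds, chosen only for this sketch.
LEARN_SPAM_AT = 4.0
LEARN_HAM_AT = 0.1


def phrase_score(text):
    """Sum the scores of every phrase rule found in the message."""
    lowered = text.lower()
    return sum(score for phrase, score in PHRASE_RULES.items() if phrase in lowered)


def tokenize(text):
    """Split a message into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class TokenStore:
    """Per-class token counts, the raw material a Bayes filter would use."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()

    def auto_learn(self, text):
        """Train only on messages the phrase rules are confident about."""
        score = phrase_score(text)
        if score >= LEARN_SPAM_AT:
            self.spam.update(set(tokenize(text)))
            return "spam"
        if score <= LEARN_HAM_AT:
            self.ham.update(set(tokenize(text)))
            return "ham"
        return None  # ambiguous score: leave the token store untouched
```

The point of the sketch is that every token in a confidently scored message gets learned, not just the phrases that triggered the rules, so the crude rules bootstrap a much wider vocabulary for the statistical side.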
I saw their point in this, though I still think my changes (still in
their Bugzilla and awaiting some future grand overhaul of phrase
testing) would not have compromised the Bayes auto-learning while
providing a major boost in phrase testing performance... but that's a
topic for another day.
Aaron Sherman <firstname.lastname@example.org>
Senior Systems Engineer and Toolsmith
"It's the sound of a satellite saying, 'get me down!'" -Shriekback