rejecting spam at SMTP time (was Re: Postfix anti-antivirus (was Re: [linux-elitists] etc))

Karsten M. Self kmself@ix.netcom.com
Fri Sep 24 21:18:27 PDT 2004


on Fri, Sep 24, 2004 at 10:19:09AM -0700, Don Marti (dmarti@zgp.org) wrote:
> begin  Karsten M. Self quotation of Fri, Sep 24, 2004 at 03:02:00AM -0700:
> 
> > Which user, and harder than what, Don?
> 
> Alice finishes her work on a document on Friday night, sends it to
> bob@example.com, and heads to Tahoe.  The next morning, Bob can't find
> it in his inbox, and fortunately doesn't have to reach either
> postmaster@example.com or Alice because he can go to his spam page and
> search for it.

The disconnect you've got, Don, is thinking that sending an SMTP-time
reject means:

  1. No logging of incoming mail attempts (Bob can at least check to see
     if Alice _tried_ to send).

  2. No spooling of at least a portion of spam-classified mail.  Say,
     low-scoring stuff, or the first few instances from a given spammy
     netblock.

Sure:  firewall rules can block entirely, but if you're making your
decisions based on MTA actions, you've got a lot of flexibility,
including sending a "we don't accept this" reply while keeping a copy
anyway.
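
A minimal sketch of that kind of MTA-side policy, in Python.  It's not
tied to Postfix or any real MTA hook; the class names, thresholds, and
reply text are all made up for illustration, and the "spool" is just a
list standing in for a quarantine directory:

```python
import logging
from dataclasses import dataclass, field

log = logging.getLogger("smtp-policy")

# Hypothetical thresholds; tune against your own mail.
REJECT_SCORE = 5.0   # at or above this, refuse the message
SPOOL_BELOW = 10.0   # rejected mail scoring under this is still kept


@dataclass
class Decision:
    reply: str      # SMTP reply line sent back to the client
    spooled: bool   # True if a copy was kept for later review


@dataclass
class Policy:
    # Stand-in for a quarantine directory or "spam page".
    spool: list = field(default_factory=list)

    def decide(self, sender, recipient, score, body):
        # Always log the attempt, so the recipient can at least
        # check whether the sender tried.
        log.info("attempt from=%s to=%s score=%.1f",
                 sender, recipient, score)
        if score < REJECT_SCORE:
            return Decision("250 2.0.0 Ok", spooled=False)
        keep = score < SPOOL_BELOW
        if keep:
            # Reject at SMTP time, but keep the low-scoring stuff.
            self.spool.append((sender, recipient, score, body))
        return Decision("550 5.7.1 Message refused as probable spam",
                        spooled=keep)
```

The sender gets an immediate 5xx either way; what differs is only
whether Bob has something to search for afterward.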

The other side is that there are enough _other_ cracks in the email
edifice that this one counts as relatively minor, all considered.  Sure,
you can come up with instances where Real Harm is done.  But if your
messages are that critical, take out-of-band measures to ensure they're
delivered or have been received.

 
> > First:  SpamAssassin will be implementing some form of ASN/CIDR scoring
> > in a near future release.  In its simplest form, this means that the
> > network-of-origin will be determined and an overall spaminess/haminess
> > rating for it computed (and likely a volume metric as well).  This all
> > pretty much just falls out of creating a token and letting the Bayesian
> > classifier go to town on it.
> 
> The more tokens the better.  Even something as simple as a
> time-of-day/day-of-week token could help catch mail that originated
> outside the time zone or zones of your regular correspondents. 

Some of which is already in SA, at least insofar as mail with improbable
dates (future, distant past) is considered spammy.

My understanding of the Bayesian classifier is that all headers save
"From", "To", "Subject", and "Date" are pretty much classed together,
and fall into the Bayes token pool.
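
For illustration only, here's roughly what turning the network of
origin and the time of day into tokens might look like in Python.  This
is not SpamAssassin's code; the function names and token formats are
invented, the /24 prefix is an arbitrary choice, and the ASN lookup
(which needs external routing data) is left out entirely:

```python
import ipaddress
from datetime import datetime, timezone

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def netblock_token(ip, prefix_len=24):
    # Collapse the connecting address to its enclosing CIDR block,
    # so every host in the block yields the same token.
    net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return f"net:{net}"


def time_tokens(when):
    # One token for hour of day, one for day of week, both in UTC.
    # `when` must be a timezone-aware datetime.
    utc = when.astimezone(timezone.utc)
    return [f"hour:{utc.hour:02d}", f"dow:{DAYS[utc.weekday()]}"]


def extra_tokens(ip, when):
    # Tokens to hand to a Bayesian classifier alongside the usual
    # header and body tokens.
    return [netblock_token(ip)] + time_tokens(when)
```

The classifier doesn't need to know what the tokens mean; it just
learns which ones turn up more often in spam than in ham.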
 
> > Second:  reporting of such stats may be of some utility in getting
> > networks to shape up.  While my top contributor, KORNET, has held
> > its first-place ranking for the nine months I've watched, several
> > other players (notably Telstra and SBC) have entered and exited the
> > top five slots.  I don't know if it's me, but it's pretty clear that
> > if you can be readily classified and identified as a spamhaus, _and_
> > you have legitimate business interests at odds with that moniker,
> > you might want to fix the problem.
> 
> Dude, I think you could be the next Netcraft here.  You have
> information that every marketing person with an interest in getting
> legit bulk mail through will want.

Which is why you want an aging factor worked in.

This is one of the few things that actually worked out well in the
Kuro5hin moderation / mojo system.  Data were aged, so that measurements
from the past counted less than current ones.  The weight was a simple
inverse of days elapsed, with a cutoff at IIRC 60 or 90 days total.
Which means you could have a golden reputation two months ago, then no
activity, then suddenly turn black hat, and the system would weight the
current information 60x more than the 60-day-old stuff.
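
A sketch of that weighting in Python, working from my recollection
rather than from Kuro5hin's actual code.  The function names and the
60-day default window are assumptions:

```python
def age_weight(days_elapsed, window=60):
    # Inverse-days weighting: data from today (or day 1) counts 1,
    # data N days old counts 1/N, anything past the window counts 0.
    days = max(days_elapsed, 1)
    if days > window:
        return 0.0
    return 1.0 / days


def aged_score(observations, window=60):
    # observations: iterable of (days_elapsed, value) pairs, where
    # value might be +1 for a good rating and -1 for a bad one.
    total = 0.0
    weight_sum = 0.0
    for days, value in observations:
        w = age_weight(days, window)
        total += w * value
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0
```

With one good rating from 60 days ago and one bad rating from
yesterday, the weights are 1/60 and 1, so the aged score comes out at
-59/61: nearly all the way to the bad end.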

Which is the benefit of _locally_ generated aggregated point-of-origin
statistics.  You're looking at your own experience with a remote
sender's space, not some central authority's say-so.
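
Keeping such stats locally doesn't take much.  A rough Python sketch,
with an invented class name and an arbitrary /24 granularity (grouping
by ASN would need external routing data, omitted here):

```python
import ipaddress
from collections import defaultdict


class OriginStats:
    def __init__(self, prefix_len=24):
        self.prefix_len = prefix_len
        # netblock string -> spam and ham counts seen locally
        self.counts = defaultdict(lambda: {"spam": 0, "ham": 0})

    def record(self, ip, is_spam):
        # File each classified message under its enclosing netblock.
        net = ipaddress.ip_network(f"{ip}/{self.prefix_len}",
                                   strict=False)
        self.counts[str(net)]["spam" if is_spam else "ham"] += 1

    def top_spam_sources(self, n=5):
        # Netblocks ranked by spam count, highest first, returned as
        # (netblock, spam_count, ham_count) tuples.
        ranked = sorted(self.counts.items(),
                        key=lambda kv: kv[1]["spam"], reverse=True)
        return [(net, c["spam"], c["ham"]) for net, c in ranked[:n]]
```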

There's also the point that a clean net would likely be so because of
its good network hygiene practices.  Spammers wouldn't find ready entry,
and would find themselves quickly booted, from a good net.  And if the
network had just been lucky until now, the stats would turn pretty
quickly.


 
> No match for "DELIVERABILITY.NET".

http://www.senderbase.com/

Not exactly the same thing, but close.


Peace.

-- 
Karsten M. Self <kmself@ix.netcom.com>        http://kmself.home.netcom.com/
 What Part of "Gestalt" don't you understand?
    Vote Bush in '04: "I Has Incumbentory Advantitude"