[linux-elitists] e-mail scriptrollas

Nick Moffitt nick@zork.net
Thu Sep 11 22:00:56 PDT 2003

begin  Adam Kessel  quotation:
> While we're on this metatopic, I do almost all my email on BART, so
> I really appreciate the inclusion of article text in postings. 

	This brings me to an idea that I'd like to post in the spirit
of establishing prior art.  

	So far there seem to be a few major mechanisms for managing
e-mail for someone who spends time at more than one computer.  I shall
name them the POP method, the IMAP method, the ssh method, the rsync
method, and the MX method.

	POP: Pretty simple, you download a bunch of e-mail that you
	    haven't downloaded before.  It's one of the early
	    mechanisms used for reading mail on a machine other than
	    the one the mail was delivered to.
		pros: simple, standard, works everywhere, good for
			disconnected use.
		cons: if you move from machine to machine, you leave
			pieces of your mail on those boxes, or at
			least you can't save critical mails to an
			important folder that you can get to from your
			main system.  Also, you have to wait for a
			large mail to finish before getting to the one
			after it in the mail spool.

	IMAP: A more advanced pseudo-NNTP-like protocol, where you get
	    lists of message-IDs, and tell the server what to fetch,
	    what to delete, what to save where.
	    	pros: far more flexible and powerful than POP, central
			storage.  You can decide what not to download
			based on headers.
		cons: not many well-done implementations, need full
			connection during mailreading to take
			advantage of pros.

	ssh: Just ssh to the box your mail is on, and run
	    elm/mutt/mailx/less/mh/gnus/X-forwarded GUI client.
		pros: fully centralized, uniform interface,
			well-controlled setup.  Client support is
			almost as good as POP, especially if you have
			something like Mindterm SSH set up.
		cons: need full connection throughout no matter what,
			network lag or glitches are felt immediately
			at the user interface level.

	rsync: rsync your mailboxes to the machine you want to read
	    on.  If you've got a single sequentially-appended inbox,
	    and/or use rsync's flags cleverly, you can often make this
	    perform the same as POP.  Thus, the pros/cons are
	    comparable.  Using MH/Maildir changes them somewhat.

	MX: Make your laptop the main MTA host, and your mail receipt
	    server a backup MX.  When you bring your laptop up, dyndns
	    or similar gets the hostname, and a script or other event
	    causes a flush of the queue from the backup MX host.  You
	    may see UUCP used for this.
	    	pros: Fire-and-forget.  You can treat your
			laptop/zaurus/whatever like an actual Internet
			mail server with a flakey connection, and just
			let SMTP, UUCP, and their ilk do all the work
			of sending and receiving.
		cons: Unpredictable, also batshit crazy.

So I personally use the ssh method exclusively.  When I was going on a
long train ride and couldn't count on network access for more than
twenty minutes at a time, I used rsync.  The problem I noticed was
that rsync has no intelligence about mbox format (and rightly so!).
If I marked a message as read in my laptop's spool, and a new mail
came in on the server's spool, I had no way of merging changes.

What I propose is a system to manage changes made on a well-configured
laptop/zaurus/whatever that creates a "mailreading script" which gets
sent to the server side.  This could be formail, or it could be a
script format of a completely new style (perhaps driving a formail
wrapper of some sort).

Step 1:  In San Francisco, Alice rsyncs /var/mail/alice from
    bob.zork.net to her laptop, without using --delete.  She now has a
    complete copy of the spool on each system (msgids
    <oldbob@bob.zork.net>, <242@emad.xyzzy.oh>, and
    <777777@make.monkey.fast>).  She packs up the laptop and hops on
    BART, catching the train to Berkeley.

Step 2: While Alice waits on the platform, Bob sends her a mail (msgid
    <newbob@bob.zork.net>) that is appended onto /var/mail/alice on
    bob.zork.net.  Alice does not have a copy of this on her laptop.

Step 3: Alice pops open her laptop to listen to oggs with her
    noise-cancelling headphones as the 10-car Richmond train roars
    through the transbay tube at 80MPH.  She reads
    <oldbob@bob.zork.net>, and saves it to ~/mail/bob.  She reads
    <242@emad.xyzzy.oh> and leaves it in her spool, but no longer
    marked New.  She finds <777777@make.monkey.fast> to just be spam,
    and deletes it.  As she takes these actions, her MUA catches them
    and generates a log:

Status	<oldbob@bob.zork.net>	RO
Save	<oldbob@bob.zork.net>	=bob
Status	<242@emad.xyzzy.oh>	RO
Delete	<777777@make.monkey.fast>
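
As a rough illustration, a log in the format above could be read back
with something like this Python sketch.  It assumes the fields are
tab-separated, as in the sample, and the names Action, FIELDS, and
parse_journal are made up for the example; no existing MUA is known to
emit this format.

```python
from typing import List, NamedTuple, Optional


class Action(NamedTuple):
    verb: str                  # "Status", "Save", or "Delete"
    msgid: str                 # Message-ID, angle brackets included
    arg: Optional[str] = None  # flags for Status, folder for Save


# Number of tab-separated fields per verb (assumed from the sample log).
FIELDS = {"Status": 3, "Save": 3, "Delete": 2}


def parse_journal(text: str) -> List[Action]:
    """Turn the journal text into a list of Action tuples, in order."""
    actions = []
    for lineno, line in enumerate(text.splitlines(), 1):
        if not line.strip():
            continue  # tolerate blank lines
        parts = line.split("\t")
        verb = parts[0]
        if verb not in FIELDS or len(parts) != FIELDS[verb]:
            raise ValueError(f"line {lineno}: malformed entry {line!r}")
        actions.append(Action(*parts))
    return actions
```

Rejecting malformed lines outright, rather than guessing, seems the
safer default for something that will go on to edit a mail spool.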

Step 4: Alice opens up her laptop in an 802.11b hotspot in Berkeley.
    Her network start events send the log from Step 3 to bob.zork.net
    and execute it there.  The result is that on bob.zork.net (since it's
    an mbox), a "Status: RO" is added to <oldbob@bob.zork.net> and
    <242@emad.xyzzy.oh>, then <oldbob@bob.zork.net> is chopped from
    the spool and moved to ${MAIL}/bob, and <777777@make.monkey.fast>
    is deleted from the spool.  Again, all these changes take place on

Step 5: Upon success of the log-script, rsync starts up again,
    bringing the deletions of the two messages and the addition of the
    Status: header down to her local spool, as well as the message

Part of the beauty of this is that you can make the scripts rather
tolerant of interference on the server-side spool.  Between steps 3
and 4, Alice could have hit the Berkeley Public Library and ssh'd to
bob.zork.net and read all four mails in question.  The script would
find the "Status: RO" already there and just pass over them.  Deleting
message IDs that you can't find is also an easy ignore.  Saves whose
message is in neither the spool nor the destination folder might be
worth a notification mail, but at least there's a copy on the laptop if
something really got lost (perhaps more advanced recovery could
actually automatically grab that copy, but I'm wary).
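
The tolerant replay described here might look something like the
following Python sketch, built on the standard library's mailbox
module.  The function names (find_key, has_message, apply_journal), the
(verb, msgid, arg) tuples, and the folder_dir parameter that "=bob" is
resolved against are all assumptions made for illustration.  It does
not lock the destination folder, so treat it as a starting point only.

```python
import mailbox
import os


def find_key(box, msgid):
    """Return the mailbox key of the message with this Message-ID, or None."""
    for key, msg in box.iteritems():
        if (msg["Message-ID"] or "").strip() == msgid:
            return key
    return None


def has_message(path, msgid):
    """True if the mbox at `path` already holds this Message-ID."""
    box = mailbox.mbox(path, create=False)
    try:
        return find_key(box, msgid) is not None
    finally:
        box.close()


def apply_journal(actions, spool_path, folder_dir):
    """Replay (verb, msgid, arg) tuples against an mbox spool.

    Returns a list of problems that might deserve a notification mail.
    """
    problems = []
    spool = mailbox.mbox(spool_path)
    spool.lock()
    try:
        for verb, msgid, arg in actions:
            key = find_key(spool, msgid)
            if verb == "Status":
                if key is None:
                    continue  # message no longer in the spool: pass over
                msg = spool[key]
                missing = set(arg) - set(msg.get_flags())
                if missing:   # flags already present are left alone
                    msg.add_flag("".join(sorted(missing)))
                    spool[key] = msg
            elif verb == "Delete":
                if key is not None:  # unknown Message-ID: easy ignore
                    spool.remove(key)
            elif verb == "Save":
                dest_path = os.path.join(folder_dir, arg.lstrip("="))
                saved = os.path.exists(dest_path) and has_message(dest_path, msgid)
                if key is not None:
                    if not saved:
                        dest = mailbox.mbox(dest_path)  # created if missing
                        dest.add(spool[key])
                        dest.close()                    # close() flushes to disk
                    spool.remove(key)
                elif not saved:
                    problems.append(f"Save {msgid} {arg}: message not found")
        spool.flush()
    finally:
        spool.unlock()
        spool.close()
    return problems
```

Because every action checks the current state of the spool before
touching it, running the same journal twice should leave the second run
with nothing to do.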

	PROS: This system gets you the best of both the POP and ssh
	    mechanisms.  Your mail is all in your
	    favorite format on your favorite box, but you temporarily
	    fiddle with a cache copy, and the journal of your actions
	    gets checkpointed when you need to.
	CONS: It's a programmatic script munging your mail spools and
	    folders.  This opens up all sorts of disaster scenarios,
	    but I feel these can be overcome with more fault-tolerant
	    approaches (perhaps all deletes move to a temporary trash
	    folder for you to logrotate or sift through).
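
The trash-folder idea could be sketched roughly as below, again using
Python's standard mailbox module.  The soft_delete name and the
trash_path parameter are invented for the example, and the caller is
assumed to hold the spool open and to flush it afterwards.

```python
import mailbox


def soft_delete(spool, key, trash_path):
    """Move the message at `key` from an open spool into a trash mbox."""
    trash = mailbox.mbox(trash_path)  # created if it does not exist yet
    try:
        trash.lock()
        trash.add(spool[key])
        trash.flush()                 # get the copy onto disk first
    finally:
        trash.close()                 # also releases the lock if held
    spool.remove(key)                 # only now drop it from the spool
```

Writing the trash copy before removing the original means a crash in
between leaves a duplicate rather than a lost mail.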

It's possible that this is simply the sort of thing that appropriate
IMAP use will grant you (provided your IMAP server does mbox
securely).  It's just that many people have advocated what you *can*
do with IMAP, but nobody seems to have ever written anything that lets
you *actually do it*.

Support your droogs!

