Posted on gmane.mail.mimedefang today. - Tina Marie's Ramblings
Red hair and black leather, my favorite colour scheme...
skywhisperer
Posted on gmane.mail.mimedefang today.
Geeky mail-admin stuff follows.

I really like the idea of graylisting, but I'm a bit worried about 4 hour delays from machines with slow queue runners.

So here's my theory:

Everything scored by SpamAssassin over a certain threshold (set per domain on my box, but most of them are around 6 or 7) gets rejected out of hand.
Everything scoring under 0 is probably not spam.

I just want to graylist the middle stuff, very gently.

I see two tables: One would contain fields for IP, sender addy, and recipient addy, and the last time it was tried. Call this the "tempFailed" table. The other would contain an IP and a time - this would be the "retriedSuccessfully" table.
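A minimal sketch of those two tables, for illustration only. It uses Python's built-in sqlite3 as a stand-in for MySQL so it runs anywhere; the column names (ip, sender, recipient, last_tried, last_seen) and the choice of integer Unix timestamps are assumptions, not a settled design.

```python
import sqlite3

def create_tables(conn):
    """Create the two graylisting tables if they do not exist yet."""
    conn.executescript("""
        -- One row per (IP, sender, recipient) triple that was sent a 451.
        CREATE TABLE IF NOT EXISTS tempFailed (
            ip         TEXT    NOT NULL,
            sender     TEXT    NOT NULL,
            recipient  TEXT    NOT NULL,
            last_tried INTEGER NOT NULL,   -- Unix time of the last attempt
            PRIMARY KEY (ip, sender, recipient)
        );
        -- One row per IP that has retried properly at least once.
        CREATE TABLE IF NOT EXISTS retriedSuccessfully (
            ip        TEXT    PRIMARY KEY,
            last_seen INTEGER NOT NULL     -- Unix time of the last good mail
        );
    """)

conn = sqlite3.connect(":memory:")
create_tables(conn)
```

Making the lookup columns the primary key means both lookups hit an index, which should matter more than the choice of database at this volume.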

My theoretical filter_end would look like:

If it's a virus, reject it.
Call SA.
If it's going to be rejected, reject it.
If it's less than 0, deliver it.
Otherwise, grab the IP of the machine connecting.
Look it up in the retriedSuccessfully table.
If it's there, accept it, and update the time.
If it's not, look it up in the tempFailed table.
If it's there, accept it, and add it to the retriedSuccessfully table.
If not, add it to the tempFailed table, and send a 451 to the server.
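The steps above can be sketched as a single decision function. This is Python rather than the Perl a real MIMEDefang filter_end would be written in, and it uses plain dicts as stand-ins for the two tables; the function name, its parameters, and the three return values are all hypothetical, chosen just to make the control flow concrete.

```python
import time

REJECT, ACCEPT, TEMPFAIL = "reject", "accept", "tempfail"

def graylist_decision(is_virus, score, reject_threshold,
                      ip, sender, recipient,
                      temp_failed, retried, now=None):
    """Return REJECT, ACCEPT, or TEMPFAIL (i.e. answer with a 451).

    temp_failed maps (ip, sender, recipient) -> time of last attempt.
    retried     maps ip -> time of last accepted mail.
    Both dicts are stand-ins for the database tables.
    """
    now = time.time() if now is None else now

    if is_virus:
        return REJECT
    if score > reject_threshold:      # over the per-domain threshold
        return REJECT
    if score < 0:                     # probably not spam, skip graylisting
        return ACCEPT

    # Middle scores: graylist gently.
    if ip in retried:
        retried[ip] = now             # known good retrier, refresh the time
        return ACCEPT

    key = (ip, sender, recipient)
    if key in temp_failed:
        retried[ip] = now             # it came back, so remember this IP
        return ACCEPT

    temp_failed[key] = now            # first sighting: ask it to retry
    return TEMPFAIL
```

With empty tables, a message scoring 3.0 against a threshold of 6.0 gets TEMPFAIL on the first attempt and ACCEPT on the retry, after which every middle-scoring mail from that IP is accepted without a delay.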

I'd also need a cron job to expire the databases. I was thinking anything in tempFailed over a day old could be tossed, and the retriedSuccessfully table would hang around for a month or so after the last good mail.
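A rough sketch of what that cron job might run, again with sqlite3 standing in for MySQL. The table and column names are assumed, and "a month or so" is taken as 30 days purely for the example.

```python
import sqlite3

DAY = 24 * 60 * 60
TEMPFAILED_MAX_AGE = DAY        # toss tempFailed rows older than a day
RETRIED_MAX_AGE = 30 * DAY      # keep good IPs about a month after last mail

def expire(conn, now):
    """Delete stale rows; return (tempFailed removed, retriedSuccessfully removed)."""
    cur = conn.execute("DELETE FROM tempFailed WHERE last_tried < ?",
                       (now - TEMPFAILED_MAX_AGE,))
    removed_temp = cur.rowcount
    cur = conn.execute("DELETE FROM retriedSuccessfully WHERE last_seen < ?",
                       (now - RETRIED_MAX_AGE,))
    removed_retried = cur.rowcount
    conn.commit()
    return removed_temp, removed_retried

# Small demonstration with one stale and one fresh row in each table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tempFailed "
             "(ip TEXT, sender TEXT, recipient TEXT, last_tried INTEGER)")
conn.execute("CREATE TABLE retriedSuccessfully "
             "(ip TEXT PRIMARY KEY, last_seen INTEGER)")
now = 1000 * DAY
conn.execute("INSERT INTO tempFailed VALUES (?, ?, ?, ?)",
             ("192.0.2.1", "a@example.com", "b@example.org", now - 2 * DAY))
conn.execute("INSERT INTO tempFailed VALUES (?, ?, ?, ?)",
             ("192.0.2.2", "c@example.com", "b@example.org", now - 3600))
conn.execute("INSERT INTO retriedSuccessfully VALUES (?, ?)",
             ("192.0.2.3", now - 45 * DAY))
conn.execute("INSERT INTO retriedSuccessfully VALUES (?, ?)",
             ("192.0.2.4", now - 5 * DAY))
removed = expire(conn, now)
```

Because the retriedSuccessfully time is refreshed on every accepted mail, a host that keeps sending never ages out; only hosts that go quiet for a month do.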

This seems like a very lightweight implementation. I'm a little concerned about corner cases - what about server farms, where mx1, mx2...mx125 all send out data? The graylist whitepaper mentions this, but doesn't really have a very good solution. Could I leave out the IP on the tempFailed table and only look up the sender/recipient pair to get around this?
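To make the server-farm question concrete, here is a small sketch comparing the two keying choices. The helper names and the include_ip flag are hypothetical, and a dict stands in for the tempFailed table.

```python
def tempfail_key(ip, sender, recipient, include_ip):
    """Build the tempFailed lookup key, with or without the connecting IP."""
    return (ip, sender, recipient) if include_ip else (sender, recipient)

def seen_before(temp_failed, ip, sender, recipient, include_ip, now):
    """Return True if this attempt matches an earlier tempfailed one.

    On a miss, the attempt is recorded so that a later retry can match.
    """
    key = tempfail_key(ip, sender, recipient, include_ip)
    if key in temp_failed:
        return True
    temp_failed[key] = now
    return False

# First attempt comes from one farm machine, the retry from another.
with_ip, without_ip = {}, {}
first = ("192.0.2.1", "a@example.com", "b@example.org")
retry = ("192.0.2.2", "a@example.com", "b@example.org")

seen_before(with_ip, *first, include_ip=True, now=100)
retry_matches_with_ip = seen_before(with_ip, *retry, include_ip=True, now=200)

seen_before(without_ip, *first, include_ip=False, now=100)
retry_matches_without_ip = seen_before(without_ip, *retry, include_ip=False, now=200)
```

Keyed on the full triple, the retry from the second machine looks like a brand-new message and gets another 451; keyed on sender and recipient only, it matches. The likely cost is that the check gets weaker, since anything that resends the same sender/recipient pair from any host would pass, and the retriedSuccessfully table would still learn the farm one IP at a time.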

I'm planning to use MySQL, because, well, I have it. I don't see all that much email (peak for me is 100 messages an hour, average load is 30 an hour), so it shouldn't give me a huge performance hit, since only about 1/4 of that mail would even hit the first table lookup. I know a reasonable amount about designing not-horribly-inefficient queries.

I already have good monitoring in place, and I'd design it so that it could be enabled/disabled on a per-domain basis.

How expensive is a MySQL connection? I don't have a way to keep them around between emails, so I'm going to have to open and close one for every mail. I don't stream_by_recipient, but...

What do people think of the idea in general? If I implement it, I'll put it on the wiki.

Tina Marie

Current Mood: geeky

Comments
(Deleted comment)
From: skywhisperer Date: July 27th, 2005 02:56 pm (UTC)
WOMH?