
Spam spam spam spam spammity spam

I woke up this morning to about 50 spam emails and some notifications from my host that my CPU usage was about 200% over the past four hours. Turns out spamd was going mental. Not sure what caused it but it seems to be working again after I restarted it.

One of the worst things about running your own mail server is spam. I don't know much about how to do it properly. I have SpamAssassin running, I tweaked the settings and trained it well, and it works OK. Of 8,000 spams in the past week or two, I think only two made it through to my inbox. But I keep thinking there must be a better way.

For a while I tried greylisting. Greylisting means you pseudo-bounce (temporarily reject) every email from a server you haven't seen before, and force that mail server to resend it. Once it's resent, that server is added to a whitelist. The idea is that spam servers won't bother resending and genuine mail servers will.
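The logic described above can be sketched in a few lines of Python. This is only an illustration, not Postgrey's actual code: the `Greylist` class, its method names, the 300-second delay, and keying on the (client IP, sender, recipient) triplet are all assumptions made for the example.

```python
import time


class Greylist:
    """Toy greylist keyed on (client_ip, sender, recipient). Hypothetical sketch."""

    def __init__(self, delay=300, now=time.time):
        self.delay = delay      # assumed: seconds a new triplet must wait before a retry counts
        self.now = now          # injectable clock, so the behaviour can be tested
        self.first_seen = {}    # triplet -> time of the first delivery attempt
        self.passed = set()     # triplets that have retried after the delay

    def check(self, client_ip, sender, recipient):
        """Return the SMTP-style reply this delivery attempt would get."""
        key = (client_ip, sender, recipient)
        if key in self.passed:
            return "250 OK"
        t = self.now()
        first = self.first_seen.setdefault(key, t)
        if t - first >= self.delay:
            # The sender came back after the delay, so treat it as genuine.
            self.passed.add(key)
            return "250 OK"
        # Temporary failure: a well-behaved server will queue and retry.
        return "450 try again later"
```

Note that in this sketch a retry arriving from a different IP address counts as a brand new triplet and starts the wait all over again.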

I ran this way via Postgrey for a couple of months. The good thing is that it works pretty much as advertised. I went from hundreds of spam emails per day to fewer than a dozen. SpamAssassin caught all of those dozen and I never saw them. It was nice.

The problem with this, however, is twofold.

  1. All mail from people you've never heard from before is delayed 5-10 minutes. This is very annoying in certain circumstances, e.g. registering for an account at a new message board or buying something from an online store you've never used before. I'd rather see the receipt or user registration right away. So to get around this I had to go add them to a whitelist on the server every time, which was ridiculous.

  2. Not all genuine mail servers bother resending after the temporary bounce, so you lose mail. You need only look in /etc/postgrey/whitelist_clients and see the enormous list of mail servers that Postgrey knows NOT to greylist, to be scared into never using Postgrey again. This includes yahoo.com, ebay.com, a bunch of airlines, and so on. The list goes back to 2005 and is obviously incomplete, since it only includes servers that people reported having problems with. I had to add gmail.com to it myself to avoid losing mail from my wife (domains that use large pools of mail servers will always be greylisted, it seems, presumably because each retry can come from a different server in the pool).

Losing mail is the reason I stopped using Postgrey. So I'm back to SpamAssassin alone and dealing with an occasional spam or two, while my spam inbox balloons.

October 17, 2009 @ 4:59 AM PDT
Category: Linux
Tags: Spam, Email, Linux

2 Comments

Jason
Quoth Jason on October 17, 2009 @ 2:37 PM PDT

I stopped using SpamAssassin years ago, when it became obvious it was too much for my smallish VPS to deal with. At this point, I use a bunch of checks I've configured in Postfix, many of which I got from an article similar to this one.

I also block using zen.spamhaus.org, though I don't know how often that results in blocked legitimate e-mail. I certainly haven't noticed any. To wrap it up, I also use spamprobe with procmail on my personal maildir. However, 99% of the spam it has to deal with comes from accounts I pull with fetchmail, as the restrictions above seem to stop most spam that tries to come through directly.
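For anyone wondering what a blocklist check like zen.spamhaus.org involves, here is a rough Python sketch of the lookup. A DNSBL is queried by reversing the octets of the connecting IPv4 address, appending the zone name, and asking for an A record. The function names are made up for this example, and treating only 127.0.0.x answers as "listed" is a simplifying assumption. Postfix can do this natively with `reject_rbl_client`, so the code is only meant to show the mechanism, not the commenter's actual setup.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the DNS name to query for an IPv4 address in a DNSBL zone."""
    # IPv4Address() validates the input and raises ValueError on garbage.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone="zen.spamhaus.org"):
    """Return True if the zone answers with a 127.0.0.x address (assumed to mean 'listed')."""
    try:
        answer = socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No such name (or the lookup failed): treat as not listed.
        return False
    return answer.startswith("127.0.0.")
```

For example, `dnsbl_query_name("192.0.2.1")` builds the name `1.2.0.192.zen.spamhaus.org`. Because a failed lookup is treated as "not listed", a DNS outage in this sketch lets mail through rather than blocking it.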

Patrick
Quoth Patrick on October 17, 2009 @ 5:30 PM PDT

I've been using a well-trained bogofilter for years now, with very good results. Many people tell me that a Bayesian-only filter doesn't get the job done, but for my idea of what is spam/ham it just works. And it does so really fast. So maybe this could be an alternative to your current SpamAssassin solution. What I really find useful is the tristate filtering - there is Spam, Unsure and Ham. You can tune the threshold values yourself (mine are 0.65,0.35). This way you can be quite sure that there will never be Spam in your inbox, but at the same time mails where the filter is not so sure that it's really Spam won't end up in the quickly growing Spam folder. I get around 2 to 10 mails in my Unsure folder per week. Nearly all of them are actually Spam, but I don't want to risk false positives (Ham going into the Spam folder), so I leave my thresholds as they are.
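The tristate idea is easy to picture as code. The sketch below is not bogofilter itself. It just maps a spam probability between 0 and 1 onto three folders using the two thresholds mentioned above (0.65 and 0.35). The function name and the handling of scores exactly on a threshold are assumptions made for the example.

```python
def classify(score, spam_cutoff=0.65, ham_cutoff=0.35):
    """Map a spam probability in [0, 1] to 'Spam', 'Ham' or 'Unsure'."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= spam_cutoff:
        return "Spam"
    if score <= ham_cutoff:
        return "Ham"
    # Anything between the two cutoffs is left for a human to look at.
    return "Unsure"
```

Widening the gap between the two cutoffs sends more mail to Unsure and less to either Spam or Ham, which is the trade-off described above: more manual checking in exchange for fewer false positives.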