9 Posts Tagged 'Spam'

Ads on license plates?

What if when your car stops at a red light, your license plate displays ad banners? What could possibly go wrong?

Quoth the person(?) who wrote this bill:

"We're just trying to find creative ways of generating additional revenues," he said. "It's an exciting marriage of technology with need, and an opportunity to keep California in the forefront."

The forefront of annoying the hell out of people. Certainly what I need is more distractions on the road. I mean, what if there's a new brand of toothpaste and I didn't find out yet? Someone somewhere needs to earn a dime for telling me about it by any means necessary.

I'm just waiting for the first company to propose paying new parents a few hundred dollars to tattoo ads on their babies.

June 20, 2010 @ 10:09 PM PDT
Category: Rants

Printer spam: what could possibly go wrong?

As further evidence that there are no depths to which companies won't stoop when it comes to advertising, HP has come up with a great idea: Get people to hook their printers up to the internet and then spew advertisements out of their printers.

Well, it's a win-win situation for the companies doing the advertising: Not only will people see your ads, they'll pay for the ink and paper to print them. Maybe not such a great situation for the end-user though.

And then there are the privacy implications of targeting ads based on geolocating the IP address of the printer. Which I find a bit disturbing, but I guess advertisers already do that with online ads. But wait, there's more:

Ads can also be targeted based on a user's behavior as well as the content, said Vyomesh Joshi, head of HP's Imaging and Printing Group.

Looking at what I'm printing so you can try to sell me things? Just a bit creepy.

Most troubling to me is the intrusiveness of the whole thing. They're taking control of a physical object in my house and using it against me. May as well kidnap my cat and train him to spell out "BUY PEPSI" in his cat litter.

Quote some slimeball at HP:

"What we discovered is that people were not bothered by it [an advertisement]," Nigro said. "Part of it I think our belief is you're used to it. You're used to seeing things with ads."

Translation: "We know this is a really horrible idea, but if people are complacent enough to sit there and take it without complaint, what's stopping us?"

He's right though, people are used to it.

I guess TV, radio, internet, phones, product placement in movies and games, print media, billboards and the postal service just aren't enough. Clearly what the world really needs is another ad-delivery mechanism.

June 17, 2010 @ 10:59 AM PDT
Category: Rants

Lame comment spam management that works

It's been nine months since I ditched Wordpress and moved to a blog system I wrote from scratch (in Clojure). This was a great move in so many ways. One of those ways is comment spam. My site is as popular now (or maybe slightly more popular now) as it was when I was running Wordpress, so I think comparing before and after is valid.

With Wordpress, every morning I'd do the ritual of deleting overnight spambot droppings. Typically I got between 1 and 5 every night. I had a default Wordpress install and all I used for spam filtering was Akismet. Akismet did a surprisingly good job, catching literally hundreds if not thousands of spams every week which otherwise would've been ruining my site. But inevitably some would still get through. And what's worse, there were a lot more false positives than I could tolerate.

Since I started counting with my new system, which was around 6 months ago, to the best of my knowledge I've gotten zero spambot-produced comments that made it through my filters. This is pleasant, to say the least.

The system I'm using is stupid. None of it is stuff I thought of myself; I got ideas from lots of other blogs and articles I read, but the implementation is mine and it's not sophisticated. It would take a bot author a few seconds to work around it. But no one has bothered. Why bother writing a bot for my one-man blog, when you can write a bot for Wordpress and have it work on tens of thousands of blogs? And I can change my system to defeat the bots with a few lines of code just as easily as they can work around it.

So here's why I think it's working.

December 06, 2009 @ 2:34 AM PST
Category: Programming

Spam spam spam spam spammity spam

I woke up this morning to about 50 spam emails and some notifications from my host that my CPU usage was about 200% over the past four hours. Turns out spamd was going mental. Not sure what caused it but it seems to be working again after I restarted it.

One of the worst things about running your own mail server is spam. I don't know much about how to do it properly. I have SpamAssassin running, I tweaked the settings and trained it well, and it works OK. Of 8,000 spams in the past week or two, I think only two made it through to my inbox. But I keep thinking there must be a better way.

For a while I tried greylisting. Greylisting means you pseudo-bounce every email you get, and force the mail server to resend it. Once it's resent, that server is added to a whitelist. The idea is that spam servers won't bother resending and genuine mail servers will.
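A minimal sketch of that idea, not how Postgrey itself is implemented. The class name, the 300-second delay, and keying on a single server name are all assumptions made for illustration.

```ruby
# Greylisting sketch: defer a server's first delivery attempt and accept
# it once it retries after a minimum delay. All names are hypothetical.
class Greylist
  def initialize(min_delay: 300)
    @min_delay  = min_delay   # seconds a new server must wait before its retry counts
    @first_seen = {}          # server => time of its first delivery attempt
    @whitelist  = {}          # servers that have already retried successfully
  end

  # Returns :accept, or :tempfail (meaning an SMTP "try again later" reply).
  def check(server, now = Time.now)
    return :accept if @whitelist[server]

    first = (@first_seen[server] ||= now)
    if now - first >= @min_delay
      @whitelist[server] = true   # it bothered to resend, so trust it from now on
      :accept
    else
      :tempfail
    end
  end
end
```

Because this sketch keys on one server name, a domain that retries from a different machine each time would never get past the first check, which is the same failure mode as mail from large server pools.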

I ran this way via Postgrey for a couple months. The good thing is that it works pretty much as advertised. I went from hundreds of spam emails per day, to fewer than a dozen. SpamAssassin caught all of those dozen and I never saw them. It was nice.

The problem with this, however, is twofold.

  1. All mail from people you've never heard from before is delayed 5-10 minutes. This is very annoying in certain circumstances, e.g. registering for an account at a new message board or buying something from an online store you never used before. I'd rather like to see the receipt or user registration right away. So to get around this I had to go add them to a whitelist on the server every time, which was ridiculous.

  2. Not all genuine mail servers bother resending after the temporary bounce, so you lose mail. You need only look in /etc/postgrey/whitelist_clients and see the enormous list of mail servers that Postgrey knows NOT to greylist, to be scared into never using Postgrey again. This includes yahoo.com, ebay.com, a bunch of airlines, and so on. The list goes back to 2005 and obviously is an incomplete list, since it only includes servers that people reported having problems with. I had to add gmail.com to it myself to avoid losing mail from my wife (domains that use large pools of mail servers will always be greylisted, it seems).

Losing mail is the reason I stopped using Postgrey. So I'm back to SpamAssassin alone and dealing with an occasional spam or two, while my spam inbox balloons.

October 17, 2009 @ 11:59 AM PDT
Category: Linux
Tags: Spam, Email, Linux

Anti-spam field still holding

So far my silly anti-spam measures are working. Since last week I've had 1861 spam comment attempts, of which 0 were successful. 1857 of them didn't even alter the text in the captcha text field at all. Four of them inexplicably HTML-escaped the < into a &lt;.

One feature I didn't implement from Wordpress is subscribing to comments via email. Sending an email from Java is possible but a little bit painful to implement. The JavaMail API is a monster.

I do think it's useful to be able to know when someone responds to a comment you left, but is spamming your inbox really the best way? I have to think there's a better way.

I did implement an RSS feed for each individual post's comments. And separate RSS feeds for all the tags on my blog, and all the categories. When RSS feeds are generated dynamically, why not? This is all of the code for the tag feeds:

(defn tag-rss [tagname]
  (if-let [tag (get-tag tagname)]
    (rss
        (str "briancarper.net Tag: " (:name tag))
        (str "http://briancarper.net/" (:url tag))
        "briancarper.net"
        (map rss-item (take 25 (all-posts-with-tag tag))))
    (error-404)))

Plus the routing code:

(GET "/feed/tag/:name" (tag-rss (route :name)))

But I haven't uploaded the comment-feed feature because I don't know if it's overkill. Personally I am liberal with my RSS feeds, I just pop them into my Akregator and off I go. But I don't know if other people take their feeds more seriously, or what. RSS feeds can be a bit heavyweight. Maybe I should make a feed for all of my comments across all posts.

This post is related to Darn you, spammers.
March 23, 2009 @ 7:38 PM PDT
Category: Programming

Blog is still going strong

After I implemented that silly CAPTCHA yesterday, the spam was stopped. There's also a honeypot form field (it's hidden via CSS so humans don't know it's there, and if any bot POSTs text for that field, the data is rejected automatically). It's silly and easily defeated, yet it stopped all 262 spam attempts since yesterday. It looks like all the spam is for one site, but it's coming from a huge range of IPs. So it's probably a botnet. Thanks, MS Windows!
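The honeypot check amounts to very little code. Here is a rough sketch in Ruby rather than the blog's Clojure; the field name "website" and the shape of the params hash are assumptions.

```ruby
# Honeypot check: the form contains a field hidden with CSS, so a human
# leaves it empty while a naive bot fills it in.
# The default field name is a hypothetical choice.
def spam_submission?(params, honeypot_field: "website")
  # params is a Hash of submitted form fields (String => String).
  !params[honeypot_field].to_s.strip.empty?
end
```

A handler would call spam_submission? before saving the comment and drop the request when it returns true.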

I rewrote my whole CRUD layer so that I could use it for more than one database at once, and then rewrote my gallery code to take advantage, and now two hours later I have my origami gallery back up and running. Both sites are running from the same JVM. I wonder how many sites I can have going at once before the server melts into a puddle of Java-inflicted goo.

  PID PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
11338 16   0  512m 128m  12m S    0  0.3   0:28.33 java

Good thing I have plenty of RAM on the server. From looking at before and after shots of the memory usage, 66 MB is the JVM itself, and 40 MB more is Jetty and Compojure and my code and all the dependencies. Then the last ~20 MB or so is my database slurped into RAM. So I can probably fit another few tens of thousands of posts and comments in here before I have to worry much. The real test will be letting this thing run for a couple weeks and see how hard it leaks.

This post is related to Darn you, spammers.
March 18, 2009 @ 10:01 PM PDT
Category: Programming

Fun with HTTP headers

One fun thing about playing with Compojure is that it doesn't do much with HTTP headers for you, which is a good learning opportunity. RFC 2616 is rather helpful here.

For example I learned that if you don't set a Cache-Control or Expires header, your browser will happily re-fetch files over and over, which is a bit of a performance hit. Static files that don't change often like images etc. can be set with a higher Expires value so they're cached.
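As an illustration, here is a small Ruby sketch that builds both headers for a static file. The function name and the one-week default lifetime are arbitrary assumptions, not what this blog uses.

```ruby
require "time"  # adds Time#httpdate, which formats dates the way HTTP expects

# Build caching headers for a static file. max_age is in seconds; the
# one-week default is an arbitrary assumption.
def cache_headers(max_age: 7 * 24 * 3600, now: Time.now)
  {
    "Cache-Control" => "public, max-age=#{max_age}",
    "Expires"       => (now + max_age).httpdate
  }
end
```

Sending both is redundant for modern clients, since max-age takes precedence over Expires when both are present, but it does no harm.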

Another thing to keep in mind (note to self) is that using mod_proxy to forward traffic to a local Jetty server means that the "remote IP" you get from (.getRemoteAddr request) will always be 127.0.0.1. If you want the user's real remote IP, you have to look in the X-Forwarded-For header (easily accessed as (:x-forwarded-for headers) in Compojure). Given that Identicons are generated from a hash of an IP address, this has resulted in some screwed up (wrongly identical) avatars for a bunch of people in posts for the past couple days. Oops. Not much I can do to fix that now.
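A hedged sketch of that lookup in Ruby. The function name, the lowercase header keys, and the list of trusted proxy addresses are assumptions. The header can hold a comma-separated list, and anything a client sent itself can be forged, so this only trusts the last entry, which is the one the local proxy appended.

```ruby
# Pick the client IP when the app sits behind a local reverse proxy.
# headers is a Hash with lowercase header names (an assumption).
def client_ip(remote_addr, headers, trusted_proxies: ["127.0.0.1", "::1"])
  # A direct connection: the socket address is already the client.
  return remote_addr unless trusted_proxies.include?(remote_addr)

  forwarded = headers["x-forwarded-for"].to_s.split(",").map(&:strip).reject(&:empty?)
  # The proxy appends the address it saw, so the last entry is the trustworthy one.
  forwarded.last || remote_addr
end
```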

In other non-news, I just set up the spam logging for the blog so I can see the kinds of things bots are doing to get around my feeble anti-spam measures. Sadly the spam seems to have stopped entirely, right after I set this up. How annoying.

This post is related to Darn you, spammers.
March 17, 2009 @ 10:14 PM PDT
Category: Programming

Darn you, spammers.

I was in a rush to get this darn blog finally done, so I threw some stupid anti-spam measures on here. Namely, the comment form included 20 textareas, 19 of which were display: none and one of which was randomly the right one, and any text in the hidden ones would cause the comment posting to fail.
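Roughly, the decoy-textarea scheme could look like the Ruby sketch below. The field names and function names are hypothetical, and this is not the blog's actual Clojure code.

```ruby
# Decoy-textarea sketch: render several textareas, hide all but one with
# CSS, and reject any submission that puts text in a hidden one.
# Field names and the default count are illustrative assumptions.
def new_comment_form(count: 20, rng: Random.new)
  fields = (1..count).map { |i| "comment_#{i}" }
  # :real is the one visible textarea; keep it server-side, not in the HTML.
  { fields: fields, real: fields[rng.rand(count)] }
end

# params maps field names to submitted text. Returns the comment text,
# or nil when the submission should be rejected.
def comment_text(form, params)
  decoys = form[:fields] - [form[:real]]
  return nil if decoys.any? { |f| !params[f].to_s.empty? }  # a hidden field was filled

  text = params[form[:real]].to_s
  text.empty? ? nil : text
end
```

A bot that fills every textarea it finds fails the decoy check, while one that learns which field is real gets straight through, which matches how quickly this was defeated.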

It only took a spam bot 48 hours to figure this out, I guess, because the last hour I've been hammered. So I implemented a CAPTCHA as another short-term holdover until I can code up something good. At least it immediately stopped this spam bot whose crap I've been deleting for the past hour.

Hopefully this isn't too intrusive. I think it fits the site fairly well, as you will probably agree once you see it.

This post is related to Clojure 1, PHP 0
March 17, 2009 @ 6:35 PM PDT
Category: Programming

Email woes

I own my own domain (or five) and one of the good things about that is having nearly infinitely many email accounts if you want them. So I tend to make up a new account for every site I register at. This leads to amusing things like getting an email from a marketing firm asking me to complete a survey for an airline "who wants to remain STRICTLY ANONYMOUS". Sent of course to UNITED@briancarper.net. Oops.

Because of laziness I set up a catchall account on my domain so every email sent to anything @briancarper.net would be sent to me. This was such a horribly bad idea, I'm unsure how I lasted for a couple of years this way. I was getting a few hundred spam emails per day. Amazingly SpamAssassin + Thunderbird's junk mail filter caught almost every single one of them to the point where I hardly even noticed. Spam filters can be bad in the same way pain killers can be bad. They don't solve a problem, they only mask the pain so you can ignore the problem.

So I decided to stop using a catchall. Problem is that I already have around a hundred email addresses I've used for various message boards and companies and friends and family, and there's no way I'm going around to change them all. So I decided to just get a list of them all and set up a big list of postfix aliases for now.
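Given a list of addresses, generating the alias entries is mechanical. The sketch below assumes a Postfix virtual alias table, where each line maps an address to a destination; the function name and the destination mailbox "brian" are hypothetical.

```ruby
# Turn a list of addresses into lines for a Postfix virtual alias table,
# where each line maps an address to a destination mailbox.
# The destination "brian" is a hypothetical local user.
def virtual_alias_lines(addresses, destination: "brian")
  # Lowercase and de-duplicate so UNITED@ and united@ collapse to one entry.
  addresses.map(&:downcase).uniq.sort.map { |addr| "#{addr}  #{destination}" }
end
```

The resulting lines would go into the alias file that Postfix is configured to read, followed by rebuilding the lookup table.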

So, I downloaded my whole email account in mbox format and wrote a Ruby script to crawl it and make a list of all the email accounts I've ever received mail at. Thank you Linux mailserver for storing email sanely in plaintext. Luckily for me, I haven't deleted any emails from my server since 2005, so my generated list of emails is likely to be pretty complete. It pays to be obsessive sometimes.

Even a braindead brute-force Ruby script is fast enough to do this. Took a minute or two to scan 200MB of plaintext.

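#!/usr/bin/ruby
# Usage: pass the directory (or mbox file) to scan as the first argument.
require 'find'

found = {}  # addresses already printed
Find.find(ARGV[0]) do |fn|
  next unless File.file? fn
  # binread avoids encoding errors on mail that isn't valid UTF-8;
  # the dot in the domain is escaped so it only matches a literal "."
  File.binread(fn).scan(/[A-Za-z0-9_-]+@briancarper\.net/) do |email|
    next if found[email]
    found[email] = true
    puts email
  end
end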
April 18, 2008 @ 8:49 PM PDT
Category: Programming
Tags: Spam, Email, Ruby, Linux