Archive for the ‘Spamity spam, spamity spam...’ Category

A new kind of CAPTCHA?

Wednesday, March 5th, 2008

CAPTCHAs, the automated tests that are meant to prevent spam-bots from overrunning free e-mail services and comment forms, have been defeated. What’s next? Hopefully not something like this… I tried it three times, and only succeeded twice.

Statistics also indicate that it wouldn’t work too well: there are fewer than 30 variations, so even randomly choosing three of the pictures would result in a greater than 3% success rate, which is likely sufficient for spammers. And if there are only a set number of pictures to choose from, it would be easy to have a human classify them the first time they’re seen, and have the computer remember the classification (several of the same pictures came up multiple times while I was trying it). I could also think of fairly simple algorithms that would bump up the accuracy. It’s a tough problem.

When you run into a tough problem in math, it usually pays to see whether parts of it can be transformed into an easier one. For example: using public-key technology, it’s easy to confirm whether two digital signatures refer to the same ID or not, without giving out the information needed to duplicate that ID. Find some way to verify someone initially, and public-key technology could be used to confirm that it’s the same person later. It’s not foolproof (the private part of the key data could still be stolen), but it would be a major pain in the spammers’ tails.
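To make the "same person later" idea concrete, here is a minimal sketch of a sign-and-verify round trip. It is an illustration under loud assumptions, not a real scheme: it uses textbook RSA with a toy key built from two small Mersenne primes and no padding, and the names `sign`, `verify`, and the sample challenge text are all made up for this example.

```python
import hashlib

# Toy textbook-RSA key built from two Mersenne primes. Illustrative only:
# the key is far too small and there is no padding, so never use this for real.
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q                               # public modulus
E = 65537                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent (needs Python 3.8+)


def _digest(message: bytes, modulus: int) -> int:
    """Hash the message and reduce it into the range the toy key can sign."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % modulus


def sign(message: bytes, private_exponent: int = D, modulus: int = N) -> int:
    """Produce a signature that only the holder of the private exponent can make."""
    return pow(_digest(message, modulus), private_exponent, modulus)


def verify(message: bytes, signature: int, public_key=(N, E)) -> bool:
    """Check a signature using only the public half of the key."""
    modulus, exponent = public_key
    return pow(signature, exponent, modulus) == _digest(message, modulus)


# The site stores only the public key when it first verifies the person.
registered_key = (N, E)

# Later, the site hands out a challenge (it would be random in practice)
# and the returning user signs it with the private exponent.
challenge = b"hypothetical comment-form challenge"
response = sign(challenge)
print(verify(challenge, response, registered_key))
```

The site never sees the private exponent, so a stolen copy of its database doesn't let anyone impersonate the user; the weak point, as noted above, is the private key itself being stolen.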

Of course, all of this may become moot as soon as someone develops a true learning AI, one that could solve such a problem as well as a human. That kind of AI could easily be used by the bad guys to get around any CAPTCHA problem, but it could also easily identify and delete spam messages with an extremely high accuracy level. I suspect that the latter would more than make up for the former, because there are only a limited number of ways that you can write a message advertising something.

It’s a very interesting subject, all around.

ThunderBayes Report

Friday, February 29th, 2008

After one month of using it, I’m very happy to confirm that SpamBayes (via the ThunderBayes plug-in for Thunderbird) is absolutely as awesome as I’d heard it could be!

The scoring is so accurate that I decided a while back to let it automatically mark messages with a 100% spam probability as read and delete them, so that I don’t have to look at them at all; that’s the majority of the spam I get. Anything with a lower spam probability, or anything it’s uncertain about, is moved to a junk folder but left unread; I look it over to confirm that it’s really spam before getting rid of it.

The statistics for the last two weeks: the majority (83%) of spam messages I received scored 100 and were auto-deleted. 11% of spam messages scored over 90 and were marked as spam and left for my confirmation, and the remaining 6% were marked uncertain. And there were no false positives, or even false negatives.
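For the curious, the handling rules described above boil down to something like the following sketch. The function name and return labels are made up for illustration, and the 20-point "ham" cutoff is an assumption on my part (the post only mentions the 100 and over-90 levels); this is not ThunderBayes code.

```python
def triage(score: float, spam_cutoff: float = 90.0, ham_cutoff: float = 20.0) -> str:
    """Map a 0-100 spam score to an action.

    spam_cutoff matches the "over 90" level mentioned above;
    ham_cutoff is an assumed value, not something stated in the post.
    """
    if score >= 100:
        return "mark read and delete"      # never shown to me at all
    if score > spam_cutoff:
        return "junk folder (spam)"        # left unread for confirmation
    if score > ham_cutoff:
        return "junk folder (unsure)"      # also left unread for confirmation
    return "inbox"


for sample in (100, 95, 50, 3):
    print(sample, "->", triage(sample))
```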

It’s not perfect, but it’s far better than anything else I’ve used. :-)

Forwarded E-Mails

Wednesday, February 13th, 2008

One of my sisters sent me a very interesting optical-illusion e-mail recently. I enjoyed it quite a bit. But like most such messages that I see, it was forwarded, and had several different headers, and all of the e-mail addresses it had ever been sent to were intact and visible in them.

People… please, please, please don’t make it easier for spammers to get the e-mail addresses of your friends and family. If you really must forward a message, or send out a mass e-mail of your own, do it the smart way: use the BCC field instead of the TO or CC fields, and strip off any previous forwarding headers before you send it.

I’m tickled pink about ThunderBayes!

Sunday, January 27th, 2008

After several days of using ThunderBayes/SpamBayes, I’m happy to report that it’s just as awesome as it was rumored to be! :-D

Even better, I’ve been able to fix one of the problems that I had with setting it up (the multiple-accounts bug). I sent the code changes to Daniel Miller, the original ThunderBayes developer, and we’ve been discussing them. I suspect there will be an official update to ThunderBayes in the near future.

It works!

Thursday, January 24th, 2008

As reported yesterday, I’m now using ThunderBayes/SpamBayes to filter spam. I manually classified several hundred recent spam messages, and a roughly equal number of recent personal (“ham”) messages. So far, it hasn’t misclassified a single message in either direction, and most of the “unsure” classifications have been of legitimate commercial messages (that do resemble spam in many respects).

I turned on the “evidence” option in SpamBayes, so that I can see what it used to make a determination when I look at the headers for a message. It’s interesting… it quickly picked up the usually-reliable spam words (“only”, “longer”, “erections”, and “embarrassed”, for instance), but some of the others it’s coming up with are surprising… “government” gets a 0.97 spam probability; it apparently showed up in six of the spam messages I trained it on, and none of the hams. “charset:windows-1252” gets a 0.91 (63 spams, 6 hams), probably because most mail from my legitimate acquaintances is written either on Linux or on alternative mail programs under Windows. It also picked up on one of my e-mail addresses; a good portion of my spam comes in on that address.
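Those two numbers can be roughly reproduced with a short sketch. It assumes a Robinson-style adjusted word probability, with a prior of 0.5 and a prior strength of 0.45 (which I believe are the SpamBayes defaults, but treat them as assumptions), and it assumes 300 trained messages on each side as a stand-in for “several hundred”. The function name is mine, not part of SpamBayes.

```python
def word_spam_prob(spam_count: int, ham_count: int,
                   total_spam: int, total_ham: int,
                   prior: float = 0.5, strength: float = 0.45) -> float:
    """Estimate how spammy a single token is.

    prior and strength are assumed values; with few sightings of a word,
    the result is pulled toward the neutral prior.
    """
    spam_ratio = spam_count / total_spam
    ham_ratio = ham_count / total_ham
    raw = spam_ratio / (spam_ratio + ham_ratio)   # unadjusted probability
    seen = spam_count + ham_count                 # how much evidence we have
    return (strength * prior + seen * raw) / (strength + seen)


# 300 spams and 300 hams are hypothetical training-set sizes.
print(round(word_spam_prob(6, 0, 300, 300), 2))    # "government"
print(round(word_spam_prob(63, 6, 300, 300), 2))   # "charset:windows-1252"
```

Under those assumptions, the two calls give values that round to 0.97 and 0.91, matching the evidence headers. It also shows why a word seen in only six spams doesn’t get a flat 1.0: the small sample is tempered by the prior.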

Lots of fun all around. :-)

Getting SpamBayes/ThunderBayes Working (Under Linux)

Wednesday, January 23rd, 2008

Thunderbird’s built-in spam filter is pretty good, more accurate (and a lot easier to set up and use) than several others I’ve tried, but even so its accuracy still leaves something to be desired. I don’t get anywhere near as much spam now as I used to, but roughly half of my daily e-mail is still spam, so I’m always on the lookout for potential improvements to my filtering system.

Though I’ve long wanted to use SpamBayes, an open-source project that’s reputedly one of the most accurate anti-spam systems going, reading through the daunting setup instructions has always deterred me. They seemed to imply that procmail or some other arcane server-type mail system was required for it. But while poking around the Internet today, doing research for another project, I discovered that there’s a Thunderbird plug-in for it, ThunderBayes! I immediately tried to install it.

“Transactions specialist – flexi-time contract hire”

Tuesday, December 18th, 2007

I’ve pretty much given up posting the spam and scam messages I receive here, because there’s very little new in them, just the same tired rehashes of the same tired fake offers. But this one is notable because it’s one of the few actual “money mule” e-mails I’ve ever gotten.

I know I shouldn’t have to say this, but if you ever get a message like this, don’t be tempted. You’re being asked to take stolen or scammed money from people and send it to the thieves, and when the police go looking, it’ll be you — the “mule” — who they charge.

Note that, like pretty much all spam, this was mass-produced and sent to lots of different e-mail accounts, which is obvious because the address in the to-line only has the same first few letters as one of my addresses. It was obviously sent to an alphabetical list of target addresses.

After much consideration, I’ve decided to leave the entire message unchanged, including the e-mail address, because someone might find this entry by searching on that. Again, don’t be tempted to contact this scammer!

Chain E-Mails

Thursday, December 13th, 2007

In 1997 I set up an e-mail account to support several software products I’d written for my (now-former) company. Because customers and potential customers needed to be able to find it easily, I put it on the company web page too. Predictably, it became an instant spam magnet. I couldn’t trust automated spam filters (when dealing with customer e-mails, even a single false positive is unacceptable), so I went through every message and manually decided whether it was spam or not.

Before I turned the account over to a co-worker in 2004, it was getting several hundred spam messages each day. As you may imagine, going through these on a daily basis (weekends included) got very old.

So when my mother, my sisters, and one of my aunts each got e-mail access, and started forwarding good-luck/bad-luck/money chain e-mails and ridiculous hoaxes to me, I didn’t have a lot of patience with them.

Weird Auto-Blogs

Sunday, October 28th, 2007

No, I’m not talking about blogs about cars. I’m talking about pseudo-blogs that are obviously completely automated, collecting entries about certain subjects from other blogs and linking to them. Geek Drivel has had three such hits so far; I’m letting them stay linked for now, while I try to figure out what their purpose is.

I’m certain it’s some form of link-spamming, and I suspect I know how it works too: one spammer or group of spammers has gotten wise to the fact that their link-spams aren’t getting through because it’s trivial for link-spam checkers to ensure that the link goes to an article that links back to it. So they’ve created these pseudo-blog sites to collect links, using actual blog software that’s automated to quote some piece of the article that seems sort of relevant to the title of the blog. That mollifies the automated link-spam checkers, and passes a cursory manual examination as well, getting the links in place. Then at some point, they’ll replace the pseudo-blogs with some penis-pill or stock-scam site, and have lots of ready-made links to it to boost its Google ranking.
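The link-back check mentioned above is simple enough to sketch. This is only a guess at the general idea, not the code any particular blog package uses: the class and function names are made up, the URLs are placeholders, and the page is passed in as a string rather than fetched from the network.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def links_back(source_html: str, my_url: str) -> bool:
    """Return True if the linking page really contains a link to my article."""
    collector = LinkCollector()
    collector.feed(source_html)
    wanted = my_url.rstrip("/")
    return any(href.rstrip("/") == wanted for href in collector.links)


# Placeholder page and URL, purely for illustration.
page = '<p>Quoted snippet... <a href="http://example.com/my-article/">source</a></p>'
print(links_back(page, "http://example.com/my-article"))
```

A pseudo-blog that quotes a snippet and links to the article sails straight through a check like this, which is exactly why the trick would work.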

We’ll see if I’m right. :-)

“Russian spammer murder hoax exposed”

Sunday, October 21st, 2007

Darn. I was hoping it was true, and apparently I wasn’t alone. Maybe the Mafia’s PR department spread the story. ;-)