Killing Blog Spam without Captcha

There are a few alternatives to Captcha. One is a nice trick that Damien Katz recently blogged about. It's ultimately doomed, because a spammer can easily adapt, but it is neat and it's working for now.

I've been mulling over a couple of ideas with Geert, who was thinking of implementing the first one for his Paste Bin site.

1. Use unpredictable form field names

When the user clicks on the "add comment" button, a comment form is presented with the usual fields: Name, Email, Website and Comment. The form fields, however, do not have predictable names. Instead, the field names are generated from some user-derived secret, maybe something stored in the user's session. The important thing is that the field names are predictable to the server, but unpredictable to the user (or to a spamming tool posing as one).

The server can then easily work out which submitted value relates to which input field. A spammer, on the other hand, only has the HTTP stream to go on: the sole clue is which visible label sits next to which field, and there are plenty of ways to obscure this in the HTML. You can even alter the field ordering at runtime to make life even harder.

In order to submit the spam now, the spamming tool will need to manage cookies, visit a set of pages, and then guess which field names relate to which fields. This could be used in conjunction with fields hidden using CSS to make the guesswork near impossible.
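As a rough illustration of the field-naming idea, here is a minimal Python sketch. It assumes the secret is a random value kept in the server-side session and that field names are derived from it with an HMAC; the helper names (`new_session_secret`, `field_name`, `render_field_names`, `decode_submission`) are hypothetical and only one of several ways this could be done.

```python
import hashlib
import hmac
import secrets

# The four logical fields come from the post; all helpers are hypothetical.
FIELDS = ("name", "email", "website", "comment")


def new_session_secret() -> bytes:
    """Create a per-session random secret, to be stored server-side."""
    return secrets.token_bytes(32)


def field_name(secret: bytes, logical: str) -> str:
    """Derive an opaque but reproducible form field name from the secret."""
    digest = hmac.new(secret, logical.encode("utf-8"), hashlib.sha256).hexdigest()
    # A leading letter keeps the result usable as an HTML name or id.
    return "f" + digest[:16]


def render_field_names(secret: bytes) -> dict:
    """Map each logical field to its obscured name, for rendering the form."""
    return {logical: field_name(secret, logical) for logical in FIELDS}


def decode_submission(secret: bytes, posted: dict) -> dict:
    """Map posted obscured names back to logical fields, dropping unknown names."""
    reverse = {field_name(secret, logical): logical for logical in FIELDS}
    return {reverse[key]: value for key, value in posted.items() if key in reverse}
```

A tool that posts to the literal names `name` or `comment` without first fetching the form in the same session ends up with an empty result from `decode_submission`, which the server can simply reject.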

2. Process the form with JavaScript

If the onsubmit process requires some JavaScript computation then the spammer will need to include a JavaScript engine in their toolkit for the post to be accepted. While it's possible that spammers could do this, maybe we can use this to slow them down to the point where it becomes less of a problem.

We dynamically create a short bit of JavaScript that asks the poster's browser to compute the factorial of some small prime. We can adjust the size of the prime to dictate the length of the required computation. The computation can happen in the background while the post is being written.
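Here is a minimal server-side sketch of that challenge in Python. Several details are assumptions rather than part of the idea above: the list of primes, the hidden field id `challenge`, and the reduction modulo `MODULUS`, which is only there so the browser's arithmetic stays within JavaScript's exact integer range.

```python
import math
import random

SMALL_PRIMES = (101, 103, 107, 109, 113)  # illustrative choices only
MODULUS = 1_000_003  # assumption: keeps every intermediate product small


def make_challenge(rng=random):
    """Return (javascript_source, expected_answer) for one comment form."""
    p = rng.choice(SMALL_PRIMES)
    # The server computes the same value the browser is asked to compute.
    expected = math.factorial(p) % MODULUS
    script = (
        "var acc = 1;\n"
        f"for (var i = 2; i <= {p}; i++) {{ acc = (acc * i) % {MODULUS}; }}\n"
        # 'challenge' is a hypothetical hidden input in the comment form.
        "document.getElementById('challenge').value = acc;\n"
    )
    return script, expected


def verify(expected: int, submitted: str) -> bool:
    """Accept the post only if the browser sent back the right answer."""
    try:
        return int(submitted) == expected
    except (TypeError, ValueError):
        return False
```

The expected answer would be kept in the session next to the form and checked with `verify` when the comment is posted. A larger prime means more loop iterations in the browser, although a plain factorial loop is cheap, so a real deployment would probably need a heavier computation to cause any noticeable delay.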

I suspect that this method is slightly flawed, because spammers will be using a bot-net to perform the attack. Slowing the submit rate down by 2 or 3 orders of magnitude isn't a big deal for them; they'll just use bigger bot-nets.

However, the annoyance of having to include a JavaScript engine in the spamming toolkit could easily stop all but the most persistent spammers.