Myqanda, Possibly A Simple Anti-Bot Solution To Fight Comment Spams?

I just checked my server log between Dec 19, 2006 and today. SCode Plugin has successfully prevented 2140 comment spams, but there were 122 spams which slipped through the cracks. Those 122 provided correct answers from the CAPTCHA image. And judging from the timestamps, I’m pretty sure that those 122 were automated spam attacks using bots. However, considering that evil spammers would go as far as hiring freelancers to solve CAPTCHAs manually, there’s a chance that some people out there somewhere ‘unintentionally’ spammed my blog manually 122 times since Dec 2006. Ouch!

There have been many attempts to fight comment spams: nofollow, blacklist, whitelist, greylist, captcha, spam karma, akismet (the most popular nowadays with a small number of false positives), etc, etc. They succeeded to a certain degree, but why are the spambots still happily crawling around?

I would like to separate the 2 issues here: the first one is identifying spams (where akismet has been highly successful), the second one is identifying bots vs humans (this is where we are playing cat and mouse with the spammers).

If we eliminate the usefulness of spambots, we reduce the amount of comment spam.

Spammers have been successful largely because of spambots. If they don’t have an automated way to spam, would they spam blog comments manually? Less likely. At least it should be a more costly process for them, and if it is costly enough maybe the value of spamming will ‘degrade’ even further. So I’m thinking if we can make the spammers think that bots are useless, then a very large portion of comment spam attempts will be reduced.

I’m proposing that we make each blog’s comment form unique by setting up a static question-and-answer pair that differs between blogs, so if there are 1000 blogs, there will be 1000 unique answers with unpatterned questions. By removing the common patterns in spam prevention efforts, we will reduce the functionality of a spambot.

The Q and As can be as simple as:

  • Q: Please type abcdef, A: abcdef
  • Q: What comes after 9, A: 10
  • Q: Just type MONEY in lowercase, A: money
  • Q: 77 times 2 is, A: 154
  • Q: Your passkey is ‘homer’ (without the quotes), A: homer
  • Q: Please fix this typo: hapipness, A: happiness
  • Q: Type the word blue with a space between b and l, A: b lue
  • Q: Do you think you’re cool? Just type yes, dude!, A: yes

It’s simple to set up, it’s simple to answer, and it would be hard enough for a bot to cater for every question and answer combination in this world. This text-based approach is definitely more usable than CAPTCHA images, plus it should be easy enough to internationalize.
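
As a rough illustration of how little logic such a check needs, here is a minimal Python sketch. It is not the Blojsom plugin's code; the names `Challenge` and `is_correct`, and the choice to ignore case and surrounding whitespace by default, are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Challenge:
    """A static question/answer pair chosen by the blog owner (hypothetical type)."""
    question: str
    answer: str


def is_correct(challenge: Challenge, submitted: str, case_sensitive: bool = False) -> bool:
    # Surrounding whitespace is ignored; inner spaces are kept, so an
    # answer like "b lue" still has to contain the space.
    expected = challenge.answer.strip()
    given = submitted.strip()
    if not case_sensitive:
        # Assumption: most questions should not fail on capitalisation alone.
        expected, given = expected.casefold(), given.casefold()
    return given == expected


challenge = Challenge("Type the word blue with a space between b and l", "b lue")
print(is_correct(challenge, " B lue "))  # True
print(is_correct(challenge, "blue"))     # False
```

A question such as "Just type MONEY in lowercase" only makes sense with `case_sensitive=True`, so the comparison mode probably needs to be configurable per question.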

I just finished a Blojsom plugin that implements the above, installed the SNAPSHOT version about 20 minutes ago, and it has prevented 28 comment spams since then.

MyQAndA Plugin is a Blojsom plugin that allows you to specify your own static question and answer for commenters to key in. The idea is to make each blog require a unique answer (varying answers between various blogs), which creates a situation where there’s no problem solving to automate, and the value of spambots will degrade.

I’m fully aware that this blog post itself might have comment spams by the time I wake up in the morning, but I can be certain that the spammer is a human.

17 thoughts on “Myqanda, Possibly A Simple Anti-Bot Solution To Fight Comment Spams?”

  1. I think this is a very interesting notion. It seems to me that it would rely on a relatively large pool of challenges, and on lazy admins not copying others’ pools. It occurs to me that it would be possible to provide a collaborative pool that served up random challenges via a web service.

  2. My initial thought was actually on ‘kindly forcing’ each blog to specify a pair of static Q and A which will push the uniqueness of each blog’s comment form (hence the name myqanda, i.e. My Q and A). But come to think of it, having a shared challenge pool would be handy for an environment with multiple blogs on a single blog software installation.

    The idea of having a ‘global’ challenge pool (via a simple REST interface?) is even more interesting, but it’s not something I can pursue given my limited resources. It would be cool if some funky web company out there implements this service.

    Specific to the Blojsom implementation, Myqanda Plugin can easily support:
    1) personal challenges (specific for each blog): configurable via blog properties
    2) shared internal challenges (shared by blogs within the same Blojsom installation): configurable via bean declaration

    And maybe in the future:
    3) shared external challenges: pulled from public web services
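
    One way the first two sources could be combined is sketched below in Python. This is an illustration only, not the plugin's actual behaviour: the function name `pick_challenge` and the rule that a blog's personal challenge takes priority over the shared pool are assumptions.

```python
import random
from typing import Optional, Sequence, Tuple

Challenge = Tuple[str, str]  # (question, answer)


def pick_challenge(
    personal: Optional[Challenge],
    shared_pool: Sequence[Challenge],
    rng: Optional[random.Random] = None,
) -> Challenge:
    """Choose the challenge to show on a blog's comment form."""
    # Assumption: a blog-specific challenge, if configured, always wins.
    if personal is not None:
        return personal
    if not shared_pool:
        raise ValueError("no challenge configured for this blog")
    # Otherwise fall back to a random entry from the installation-wide pool.
    return (rng or random).choice(shared_pool)


pool = [("What comes after 9", "10"), ("77 times 2 is", "154")]
print(pick_challenge(("Please type abcdef", "abcdef"), pool))
print(pick_challenge(None, pool))
```

    When the challenge is drawn at random, the server also has to remember which question was shown (for example in the session) so the right answer is checked on submit.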

  3. Sounds like a good approach. Well, as long as the pool of challenges is big enough and does change the exact wording of the question often enough…


  4. This approach has also been successful in the WordPress world (wp-gatekeeper plugin) and I like it much better than CAPTCHA. One catch: This only works for comments, not trackbacks. Also if the question isn’t unique enough, it’ll be trivial for spammers to attack it. (Much easier than decoding CAPTCHA, at least.)

  5. Great to hear that the approach has been successful for WordPress.
    I might rename ‘My Q and A Plugin’ to ‘GateKeeper Plugin’ :). I’m really really bad with names.

    Agreed that the approach won’t work for trackbacks. However, something like the WordPress Trackback Validator plugin might be good enough, in that it’s not something the spammer would want to do for all trackback spams.

    I implemented a similar method for Blojsom called TrackbackKeyword Plugin, which checks for the existence of some keywords on the trackback page, so it doesn’t necessarily require a link back to the post.
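
    The keyword idea can be sketched roughly as follows in Python. This is not the TrackbackKeyword Plugin's code; the function name `page_mentions_keywords`, the `minimum` threshold and the case-insensitive substring match are assumptions, and fetching the trackback page is left out so the example stays self-contained.

```python
from html.parser import HTMLParser
from typing import Iterable


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = []

    def handle_data(self, data: str) -> None:
        self.parts.append(data)


def page_mentions_keywords(html: str, keywords: Iterable[str], minimum: int = 1) -> bool:
    # Extract the page text (script and style contents are not filtered
    # out in this sketch), then count how many keywords appear in it.
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.parts).casefold()
    hits = sum(1 for keyword in keywords if keyword.casefold() in text)
    return hits >= minimum


page = "<html><body><p>Some notes on Blojsom plugins</p></body></html>"
print(page_mentions_keywords(page, ["blojsom", "captcha"]))  # True
```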

  6. I guess when it comes to hacking, the same argument could be applied to the blog server itself. It’s up to each party to come up with enough security.

    Re compromising the pool data, that’s the reason why it’s better to have many pools with few challenges, rather than having fewer pools with many challenges.

    I think it would be better if the spammers can’t even be bothered to compromise that many pools because it will just be a waste of time for them.

    If every blog provides its own unique challenge, it will definitely be too costly for spammers. Well, until the day they come up with A.I. spambot, then it will be another ball game.

  7. Lots of spam blockers start out really great, but those bots and spammers always find ways around them. It is a constant battle to stay one step ahead so that your posts do not get filled up with the nonsense.

  8. I agree that it’s going to be a constant battle. Reading what I wrote almost 2 years ago, I now think that the key is neither identifying the spams nor identifying the bots. The most important thing is how to make the various spam-fighting technologies available to the masses.

    I’m also in favor of having a bounty system for whoever can put those evil spammers in jail :p.

  9. Nice post. I totally agree with you. Spam comments are a real headache. There are plugins these days that block/delete spam comments. But the “question and answer” method is very useful. Simple but does the trick. Thanks for the information.

  10. It never seems to fail that whenever you find a way to stop people from doing something like spamming, they always end up finding a way around it.

    I have to admit though that the incidence of spam seems to be declining. At least it is for me anyway.

    Good info.

  11. Hi Cliffano,

    I noticed that this post was written a couple of years ago and that the theory behind your approach was a good one. Did you ever put it to the test, and if so, how did it go? I’m creating a blog at the moment, which in itself takes up a great deal of time, and the last thing I want to be doing is removing robot spam all the time too. Regards John

  12. John,

    I implemented it only as a Blojsom plugin with per-blog and per-installation pools of challenges, which worked very well. I didn’t get the chance to implement a public pool.

    I’ve since moved to WordPress, and I’ve been quite happy with Akismet.

    When I have some spare time, I really want to revisit the idea of a public pool as a service and attempt an implementation of it.

  13. Cheers for replying. We have decided to go with WordPress ourselves, so any hints and tips would be welcomed gratefully.

    Thanks Again
