I just checked my server log between Dec 19, 2006 and today. SCode Plugin has successfully prevented 2140 comment spams, but there were 122 spams which slipped through the cracks. Those 122 provided correct answers from the CAPTCHA image. And judging from the timestamps, I’m pretty sure that those 122 were automated spam attacks using bots. However, considering that evil spammers would go as far as hiring freelancers to solve CAPTCHAs manually, there’s a chance that some people out there somewhere ‘unintentionally’ spammed my blog manually 122 times since Dec 2006. Ouch!
There have been many attempts to fight comment spams: nofollow, blacklist, whitelist, greylist, captcha, spam karma, akismet (the most popular nowadays with a small number of false positives), etc, etc. They succeeded to a certain degree, but why are the spambots still happily crawling around?
I would like to separate the 2 issues here: the first one is identifying spams (where akismet has been highly successful), the second one is identifying bots vs humans (this is where we are playing cat and mouse with the spammers).
If we eliminated the usefulness of a spambot, then we reduced the amount comment spams.
Spammers have been successful largely because of spambots. If they don’t have an automated way to spam, would they spam blog comments manually? Less likely. At least it should be a more costly process for them, and if it is costly enough maybe the value of spamming will ‘degrade’ even further. So I’m thinking if we can make the spammers think that bots are useless, then a very large portion of comment spam attempts will be reduced.
I’m proposing that we make each blog’s comment form unique enough by setting up a pair of static question and answer which would be different between blogs, so if there are 1000 blogs, there will be 1000 unique answers with unpatterned questions. By removing the common patterns in spam prevention efforts, we will reduce the functionality of a spambot.
The Q and As can be as simple as:
- Q: Please type abcdef, A: abcdef
- Q: What comes after 9, A: 10
- Q: Just type MONEY in lowercase, A: money
- Q: 77 times 2 is, A: 154
- Q: Your passkey is ‘homer’ (without the quotes), A: homer
- Q: Please fix this typo: hapipness, A: happiness
- Q: Type the word blue with a space between b and l, A: b lue
- Q: Do you think you’re cool? Just type yes, dude!, A: yes
It’s simple to set up, it’s simple to answer, and it would be hard enough for a bot to cater every question and answer combination in this world. This text-based approach is definitely more usable than CAPTCHA images, plus it should be easy enough to internationalize.
I just finished a Blojsom plugin that implements the above, installed the SNAPSHOT version about 20 minutes ago, and it has prevented 28 comment spams since then.
MyQAndA Plugin is a Blojsom plugin that allows you to specify your own static question and answer for commenters to key in. The idea is to make each blog requires a unique answer (varying answers between various blogs), which will then create a situation where there’s no problem solving to automate and the value of spambots will degrade.
I’m fully aware that this blog post itself might have comment spams by the time I wake up in the morning, but I can be certain that the spammer is a human.