Friday, September 30, 2011

Captcha and a few variants...

I'm downloading some stuff from the Internet, and as part of that process I had to solve a captcha to prove that I'm human. A captcha can be thought of as a puzzle that has to be solved and that, by assumption, only a human can solve. In reality, though, there are automated ways of solving captchas, particularly badly designed ones. So, I was thinking a bit about captchas and decided to write about it...

There are several ways of circumventing a captcha. Two I have heard of are very interesting. The first, and older, one is that some sites hosting pirated or sexual material require you to enter a captcha before accessing the material. But that captcha actually comes from another site, the one being abused. Let me provide a simple hypothetical example. Suppose that a spammer wants to register as many mail addresses as possible with Gmail. Gmail has protection in the form of a captcha that is aimed at exactly that, preventing mass registrations. So, the spammer starts to provide some service to users, e.g. downloads of pornography. But in order to download something, the user first has to solve a captcha, and the captcha to be solved is the one Gmail presented to the spammer, relayed to the user. The user's answer is then passed back to Gmail to complete the registration.

The other way of circumventing captchas is even more bizarre. There are companies in India and China that employ humans to solve captchas manually. You are provided with an API through which you send a request; the request is routed to a human who solves it and sends back the result. What a combination of automation and manual work!? And a cheap one while we are at it: a few dollars for thousands of captchas, something like that.

So, what can be done? Well, a captcha usually has a reload button that allows you to request another puzzle. So the captcha could contain a sentence telling you to reload, and if you try to submit an answer to that particular captcha, you are banned. This would help in two cases. The first is automated recognition, which doesn't actually understand what is written in the captcha. The second case, humans solving captchas, could be restricted by localization. Namely, the instruction to reload would be presented in, e.g., Croatian, because the request comes from Croatia. If someone then forwards that captcha to India, the guy there wouldn't understand the sentence and so wouldn't be able to solve the captcha. Another possibility is to give a sentence that requires you to enter only the third word, or to choose a synonym for a given word from among several words.
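
To make the idea a bit more concrete, here is a rough Python sketch of how the server side of such an instruction-based captcha might look. Everything in it is my own assumption for illustration: the names Challenge, make_challenge and check, the three challenge kinds, and the instruction sentences themselves are hypothetical, and rendering the text into a distorted image is left out completely.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical instruction sentences, localized by the visitor's country.
INSTRUCTIONS = {
    "en": {
        "retype": "Type the text:",
        "nth_word": "Type only the third word:",
        "reload_trap": "Do not type anything, press reload:",
    },
    "hr": {
        "retype": "Prepišite tekst:",
        "nth_word": "Upišite samo treću riječ:",
        "reload_trap": "Ne upisujte ništa, zatražite novu sliku:",
    },
}


@dataclass
class Challenge:
    text: str                # what would be rendered into the captcha image
    kind: str                # "retype", "nth_word" or "reload_trap"
    expected: Optional[str]  # None for the trap: no answer is acceptable


def make_challenge(kind: str, words: List[str], lang: str = "en") -> Challenge:
    """Build a challenge whose instruction is written in the visitor's language."""
    if kind == "retype":
        expected = " ".join(words)
    elif kind == "nth_word":
        expected = words[2]  # the third word only
    elif kind == "reload_trap":
        expected = None
    else:
        raise ValueError(f"unknown challenge kind: {kind}")
    text = f"{INSTRUCTIONS[lang][kind]} {' '.join(words)}"
    return Challenge(text, kind, expected)


def check(challenge: Challenge, answer: str) -> str:
    """Return "ok", "wrong" or "ban" for a submitted answer."""
    if challenge.kind == "reload_trap":
        # Any submission means the solver did not understand the sentence.
        return "ban"
    if answer.strip().lower() == challenge.expected.lower():
        return "ok"
    return "wrong"
```

Pressing reload never reaches check at all, it just produces a new challenge, so a human who understood the Croatian sentence is never punished. A solver that blindly retypes the words gets "wrong" on the third-word variant and "ban" on the trap.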

Bringing this idea to a higher level would mean that, apart from requiring the user to retype what's written in the captcha, it would also require him/her to understand what's written there and to take some particular action based on that!

In the end, this isn't a perfect solution, but only one move in a game of cat and mouse which, for a short period of time, gives the advantage to the mouse (or the cat, depending on the view)...

