ReCAPTCHA 2.0

Published: 2014 Dec 31 by Avi Deitcher

In the first half of this year, I noted that ReCAPTCHA was a lot like the "TSA of the Web" - an annoyance that is sometimes necessary to keep bad actors out and good (or, in the case of ReCAPTCHA, "real") actors in. I also noted that Google itself had publicized that it had broken ReCAPTCHA, rather than wait for someone else to do so. In that respect, ReCAPTCHA was a lot more like the TSA - weak, broken, but good "security theatre" - than we thought.

In the last few weeks, Google released an entirely new version of ReCAPTCHA. For simplicity's sake, we will call it ReCAPTCHA 2.0 or just R2 (and perhaps the next version will be D2). You may already have seen or used it. While the goal remains the same - identify humans while filtering out robots - the method is entirely different. Gone are the strangely distorted letters and numbers that (supposedly, at least initially) only humans could read. Instead you just check the box, and most of the time R2 can tell if you are a human.

On the one hand, this is a great improvement. It is far easier to just "check the box" than to try to discern letters that oftentimes aren't clear even to humans. ReCAPTCHA's mission is twofold:

1. Let humans in while keeping robots out.
2. Provide the minimum of disruption.

The latter is a key element. Unlike with airport security, where there is a monopoly, if the disruption caused by ReCAPTCHA is too great, Web owners simply will not use it. Given the choice between letting in too many robots and annoying too many humans, they will pick too many robots every single time. After all, without enough humans, the Web business fails.

Of course, Web owners have that luxury. In the airport business, no one will pick too many bad guys over not enough legitimate fliers!

However, in shifting to R2, Google has also moved into the secrecy space. With the first ReCAPTCHA, there never was any secret. Everyone understood perfectly how the security worked; it relied on the simple fact that no robot was accurate enough (at least initially) to read one of those distorted images. The algorithm was open and out there.

With R2, Google is relying on some series of patterns, movements or timing that it is keeping secret. It has little choice: if the R2 software can detect those patterns, then anyone who knows what they are can design software to mimic them. In other words, we are now in the oft-discredited realm of "security by obscurity". R2 depends entirely on keeping its algorithms secret!
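To make that argument concrete, here is a toy sketch. It is emphatically not Google's algorithm, which is not public; the rule, the threshold and every name in it (looks_human, MIN_JITTER_MS, naive_bot, informed_bot) are invented purely for illustration. It assumes a detector that treats irregular timing between input events as a sign of a human, and shows that a bot which learns the rule can pass it trivially.

```python
from statistics import pstdev

# Hypothetical threshold, invented for this sketch. Not a real R2 parameter.
MIN_JITTER_MS = 15.0


def looks_human(intervals_ms):
    """Toy detector: accept only if the gaps between events are irregular."""
    return len(intervals_ms) >= 2 and pstdev(intervals_ms) >= MIN_JITTER_MS


def naive_bot():
    """A simple bot fires its events at a perfectly regular pace."""
    return [100.0] * 10


def informed_bot():
    """A bot that knows the rule just varies its timing enough to pass."""
    return [60.0, 140.0] * 5


print(looks_human(naive_bot()))     # False: zero jitter gives it away
print(looks_human(informed_bot()))  # True: the known rule is easy to mimic
```

The detector works only for as long as the rule stays secret. Once it is known, beating it takes a two-line change to the bot.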

Inevitably, the algorithms will come out, and probably sooner rather than later. Whether it is via a leak from one of Google's own people; a hack by the North Korean or Chinese government or by an individual Russian hacker; or just someone putting together enough bots to try enough R2 instances to detect the pattern; eventually the cat will be out of the bag.

R2 is simpler and easier to use, and probably, at least initially, more secure than the existing version. For now, that is all that matters. But the design of the new system practically begs to be broken.