A CAPTCHA is a type of challenge-response test used in computing to determine whether the user is human. The process involves one computer (a server) asking a user to complete a simple test that the computer is able to generate and grade. Because other computers are assumed to be unable to solve the CAPTCHA, any user entering a correct solution is presumed to be human. A common type of CAPTCHA requires that the user type the letters of a distorted image, sometimes with the addition of an obscured sequence of letters or digits that appears on the screen.
One of the strongest and most difficult CAPTCHAs to crack is the one used by Yahoo, which utilizes a mix of blended alphanumeric characters as shown below.
Now, a team of Russian hackers has apparently found a way to read this Yahoo CAPTCHA with 35% accuracy.
The Russian hackers had this to say about the Yahoo! CAPTCHA:
“The CAPTCHA has a vulnerability we’ll discuss later. It’s not necessary to achieve high degree of accuracy when designing automated recognition software. The accuracy of 15% is enough when attacker is able to run 100.000 tries per day, taking into the consideration the price of not automated recognition – one cent per one CAPTCHA.”
That seems a plausible conclusion. The researchers can be contacted at this address: NetworkSecurityResearch[at]gmail[dot]com. The released software package reveals some of their techniques; the implementation of the Yahoo CAPTCHA recognition engine can be found here:
The first project (the server) needs the MATLAB 2007a Compiler Runtime (MCR) installed. It waits for a connection and receives a CAPTCHA, then sends the recognized CAPTCHA text string back to the client. The client reads the jpg files in the test1 directory and sends them one by one to the server, which is located on the same machine.
There are quite a few ways to defeat CAPTCHAs, and this significant improvement in character recognition software could quite possibly be the knockout punch for using CAPTCHAs to stop automated bots. Sometimes low-paid data entry workers are also employed to defeat CAPTCHAs in bulk. Check out "Will Solve CAPTCHA for Money" on Slashdot.
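The client/server exchange described above can be sketched in a few lines of Python. This is only an illustration of the general shape, not the released package: the wire format (a 4-byte length prefix before each message), the function names, and the `recognize` stub are all assumptions, and the real engine is a compiled MATLAB program whose protocol is not described here.

```python
import socket
import struct
import threading
from pathlib import Path


def recv_exact(conn, n):
    """Read exactly n bytes, or raise if the peer closes the connection first."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def send_msg(conn, payload):
    """Send one message as a 4-byte big-endian length plus payload (assumed framing)."""
    conn.sendall(struct.pack(">I", len(payload)) + payload)


def recv_msg(conn):
    """Receive one length-prefixed message."""
    (length,) = struct.unpack(">I", recv_exact(conn, 4))
    return recv_exact(conn, length)


def recognize(image_bytes):
    """Hypothetical stand-in for the recognition engine; returns a dummy string."""
    return "len=%d" % len(image_bytes)


def serve(listener):
    """Accept one client and answer each CAPTCHA image until the client disconnects."""
    conn, _ = listener.accept()
    try:
        with conn:
            while True:
                try:
                    image = recv_msg(conn)
                except ConnectionError:
                    break  # client finished sending
                send_msg(conn, recognize(image).encode("utf-8"))
    finally:
        listener.close()


def start_server():
    """Bind to a free local port and serve one client in a background thread."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    thread = threading.Thread(target=serve, args=(listener,), daemon=True)
    thread.start()
    return thread, port


def run_client(directory, port):
    """Send every .jpg file in the directory to the local server, one by one."""
    results = {}
    with socket.create_connection(("127.0.0.1", port)) as conn:
        for path in sorted(Path(directory).glob("*.jpg")):
            send_msg(conn, path.read_bytes())
            results[path.name] = recv_msg(conn).decode("utf-8")
    return results
```

Calling `start_server()` and then `run_client("test1", port)` mirrors the setup in the description: both ends run on the same machine, and the client gets back one text string per image.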
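To see why the hackers consider a low success rate good enough, it helps to run the numbers from their quote. The sketch below uses only the figures quoted above (15% accuracy, 100,000 tries per day, one cent per manually solved CAPTCHA); the function names are made up for illustration, and integer arithmetic is used to avoid rounding surprises.

```python
def solved_per_day(accuracy_percent, attempts_per_day):
    """Number of CAPTCHAs an automated attacker solves per day."""
    return attempts_per_day * accuracy_percent // 100


def manual_cost_cents(solved, cents_per_captcha=1):
    """What the same number of solutions would cost from human solvers, in cents."""
    return solved * cents_per_captcha


attempts = 100_000  # tries per day, from the quote

# "Good enough" accuracy from the quote, and the accuracy reportedly achieved.
for accuracy in (15, 35):
    solved = solved_per_day(accuracy, attempts)
    cost = manual_cost_cents(solved)
    print(f"{accuracy}% accuracy: {solved} solved/day, "
          f"worth {cost} cents of manual solving")
```

At 15% accuracy the attacker gets 15,000 solved CAPTCHAs a day, which would otherwise cost 15,000 cents at the quoted manual rate; at the reported 35% the figure rises to 35,000.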
How to develop a good CAPTCHA, according to Jeremiah Grossman:
1. Test should be administered where the human and the server are remote over the network.
2. Test should be simple for humans to pass. Humans should fail less than 0.1% of the time on the first attempt.
3. Test should be solvable by humans in less than several seconds.
4. Test should only be solvable by the human to which it was presented.
5. Test should be hard for a computer to pass. The chance of correctly guessing the answer should be less than 1 in 1,000,000, even after 24 hours of analysis.
6. Knowledge of previous test questions, answers, results, or combination thereof should not impact the predictability of following tests.
7. Test should not discriminate against humans with visual or hearing impairments.
8. Test should not possess a geographic, cultural, or language bias.
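The two numeric thresholds in the list (items 2 and 5) can be checked mechanically against any measured rates. The following is a rough sketch: the function names are invented for illustration, the human failure rate in the example is a made-up placeholder, and the "24 hours of analysis" condition in item 5 is not modeled. Exact fractions are used so the comparisons are not affected by floating-point rounding.

```python
from fractions import Fraction

MAX_MACHINE_SUCCESS = Fraction(1, 1_000_000)  # item 5: less than 1 in 1,000,000
MAX_HUMAN_FAILURE = Fraction(1, 1000)         # item 2: less than 0.1%


def meets_numeric_criteria(machine_success, human_failure):
    """Check measured rates against the two numeric thresholds in the list."""
    return {
        "hard_for_computers": machine_success < MAX_MACHINE_SUCCESS,
        "easy_for_humans": human_failure < MAX_HUMAN_FAILURE,
    }


def expected_machine_passes(machine_success, attempts):
    """Expected number of automated passes over a given number of attempts."""
    return machine_success * attempts


# The reported 35% recognition rate, with a hypothetical 0.05% human failure rate.
print(meets_numeric_criteria(Fraction(35, 100), Fraction(5, 10_000)))

# Even a CAPTCHA sitting exactly at the 1-in-1,000,000 limit lets through
# 1/10 of a pass per day, on average, at 100,000 attempts per day.
print(expected_machine_passes(MAX_MACHINE_SUCCESS, 100_000))
```

By this measure a 35% recognition rate misses item 5 by a wide margin, regardless of how easy the test is for humans.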