unCAPTCHA, an artificial intelligence-based automated system designed at the University of Maryland, can break Google’s audio-based reCAPTCHA challenges with an accuracy of 85%. Google has spent years refining and strengthening reCAPTCHA, a Turing test-based method for proving that website users aren’t robots, and recently extended it to mobile websites for Android users.
unCAPTCHA, to be fair, doesn’t address the challenges most of us are familiar with, the ones that ask us to read distorted text and type it into a box. Instead, the AI is trained to crack audio challenges, which are offered as an option for people with disabilities.
unCAPTCHA combines free, public, online speech-to-text engines with a phonetic mapping technique. The system downloads the audio challenge, breaks it into several digital audio clips, and runs each clip through several speech-to-text systems. It then identifies exact and near-homophones among the transcriptions, weights the aggregated results by confidence level, and sends the most probable answer back to Google.
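The aggregation step described above can be sketched roughly as follows. This is an illustrative sketch only, not the researchers' code: it assumes the challenge is a sequence of spoken digits (the article does not say), the `HOMOPHONES` table is a small made-up sample, and the engine transcripts and confidence scores are mock values rather than output from any real speech-to-text service.

```python
from collections import defaultdict

# Hypothetical near-homophone table (an assumption for illustration):
# maps words a speech-to-text engine might return onto the intended digit.
HOMOPHONES = {
    "zero": "0", "oh": "0",
    "one": "1", "won": "1",
    "two": "2", "to": "2", "too": "2",
    "three": "3", "tree": "3", "free": "3",
    "four": "4", "for": "4", "fore": "4",
    "five": "5", "hive": "5",
    "six": "6", "sicks": "6",
    "seven": "7",
    "eight": "8", "ate": "8",
    "nine": "9", "mine": "9",
}


def to_digit(transcript):
    """Map one transcript to a digit, or None if it matches nothing."""
    word = transcript.strip().lower()
    if word in HOMOPHONES:
        return HOMOPHONES[word]
    if len(word) == 1 and word.isdigit():
        return word
    return None


def vote(candidates):
    """Pick the digit with the highest summed confidence for one clip.

    `candidates` is a list of (transcript, confidence) pairs, one or more
    per engine. Transcripts that map to no digit are ignored.
    """
    scores = defaultdict(float)
    for transcript, confidence in candidates:
        digit = to_digit(transcript)
        if digit is not None:
            scores[digit] += confidence
    if not scores:
        return None
    return max(scores, key=scores.get)


def solve(clips):
    """Combine per-clip votes into one answer; '?' marks an unresolved clip."""
    return "".join(vote(clip) or "?" for clip in clips)


# Mock engine output for a two-clip challenge (not real data).
clips = [
    [("to", 0.6), ("two", 0.9), ("ten", 0.8)],
    [("for", 0.5), ("five", 0.4), ("four", 0.3)],
]
print(solve(clips))
```

Summing confidences across engines means that two engines agreeing weakly on a homophone of the same digit can outvote a single engine that is confident about something else, which is the point of pooling several transcribers.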
The trial showed that the AI could solve 450 reCAPTCHA challenges with 85.15% accuracy, taking an average of 5.42 seconds per challenge: that’s less time than it takes to listen to the challenge in the first place. The research shows that bad actors don’t need significant resources to mount a successful large-scale attack on the reCAPTCHA system.
“Prior work has generally assumed that attackers against CAPTCHA systems are well-resourced,” the researchers said in a paper. “In particular, the standard threat model involves an attacker who can attack the CAPTCHA tens or hundreds of thousands of times for a relatively small number of successes, and can scale this attack to abuse services.”
They added, “An attacker with many resources can afford a lower success rate, and thus some have argued that even a success rate of 1/10,000 is sufficient to threaten the integrity of services. In our work, we will assume an attacker with limited resources; unlike previous works attacking captchas, our threat model limits the attacker to one computer, one IP address, a small amount of RAM and limited training data (less than 100MB). Therefore, we aim for accuracy benchmarks above 50%, as a low-resource attacker cannot afford a lower percentage of success.”
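The trade-off the researchers describe can be made concrete with a little arithmetic. The sketch below assumes each attempt succeeds independently with a fixed probability, so the expected number of attempts per success is one divided by the success rate; that independence assumption and the function name `expected_attempts` are illustrative and do not come from the paper.

```python
def expected_attempts(success_rate: float) -> float:
    """Expected attempts per success, assuming independent attempts
    with a fixed success probability (mean of a geometric distribution)."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return 1.0 / success_rate


# Success rates quoted in the article.
scenarios = [
    ("well-resourced attacker (1/10,000)", 1 / 10_000),
    ("low-resource benchmark (50%)", 0.50),
    ("reported unCAPTCHA accuracy (85.15%)", 0.8515),
]

for label, rate in scenarios:
    print(f"{label}: about {expected_attempts(rate):,.2f} attempts per success")
```

Under this assumption, an attacker succeeding once in 10,000 tries needs thousands of times more requests per solved challenge than one succeeding half the time, which is why a single machine on a single IP address cannot afford a low success rate.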
Source: infosecurity-magazine.com