reCAPTCHA: distributed book digitization while fighting spam

recaptcha_20070524.jpg
Thanks to spammers, we now are forced waste a substantial portion of time every day, typing in obfuscated wiggly letters to prove we are human. reCATPCHA is a slick idea for using the CAPTCHA system for doing something productive (...besides distinguising between homo sapien and homo computatralis).

With reCAPTCHA, the user is given two words, one known by the system and one from a book that previously failed character recognition. When the user enters both words, the sytem verifies the known word, proving human-ness, and submits the second word to a central database, which helps digitze books from the Internet Archive. With 60 million CAPTCHAs being solved every day, this could be a huge assist for portions of text that can't be handled by optical character regognition techniques. [via] Link

Related:
Negative CAPTCHA

Posted by Jason Striegel | May 24, 2007 10:10 PM
Blogging, Cryptography, Data, Language, Web | Permalink | Comments (0) Bookmark and Share

Recent Entries

Comments

Newest comments listed first.

Leave a comment



Bloggers

Welcome to the Hacks Blog!

Brian Jepson.Brian Jepson


Jason Striegel.Jason Striegel


Philip Torrone.Phillip Torrone



See all of the books in the Hacks Series!
Advertise here.

Recent Posts

www.flickr.com
photos in Hacks More photos in Hacks