
FromThePage Blog

Crowdsourcing, transcription and indexing for libraries and archives


ReportersLab Reviews FromThePage

October 3, 2012 by Ben Brumfield

Tyler Dukes has written a concise introduction to the issues with handwritten material and a lovely review of FromThePage at ReportersLab:

Even when physical documents are converted into digital format, subtle inconsistencies in handwriting prove too much for optical character recognition software. The best computer scientists have been able to do is apply various machine learning techniques, but most of these require a lot of training data — accurate transcriptions deciphered by humans and fed into an algorithm.

“Fundamentally, I don’t think that we’re going to see effective OCR for freeform cursive any time soon,” Brumfield said. “The big successes so far with machine recognition have been in domains in which there’s a really constrained possibilities for what is written down.”

That means entries like numbers. Dates. Zip codes. Get beyond that, and you’re out of luck.

I don't know much about the world of investigative journalism, but it wouldn't surprise me if it holds as many intriguing parallels and new challenges as I've discovered among natural science collections. Handwriting might still be the most interdisciplinary technology.
