FromThePage Blog

Crowdsourcing, transcription and indexing for libraries and archives

OCR Correction vs Transcription

February 4, 2019 by Sara Brumfield

We found this recent comment by a volunteer on a FromThePage project to be fascinating:

"I am sad to report I have found numerous errors, too many to even begin to fix, within these pages... It will be much easier to completely transcribe from the beginning correctly, than try and fix ALL the typos. Would you like me to do this for the Library?"

OCR correction is arguably easier than full transcription, but judging by this volunteer's comment, it can be less fun and more frustrating.

We spoke to the owner of this particular project, and she's hoping that high-school-aged volunteers, who are on site and working together, might be a good match for OCR correction. Younger transcribers often have less experience with cursive handwriting, which could make typed text a more natural fit for them, so I'm curious to see if this idea works.

We've seen OCR correction work on projects like Trove -- the OCR feeds the search, and the person reading the article is motivated to fix the mistakes when they run into them, because they are emotionally invested in the material. (No one likes mistakes!)

We've also seen great projects, like the Alabama Department of Archives and History's WW1 service cards project, that engage volunteers to transcribe typewritten text.

We also suspect that the proportion of errors in the text makes a difference to the corrector's experience.

If you are thinking about doing an OCR correction project, we would recommend you think about:

  1. How interesting is the text to start with? Is it fun to read?
  2. How engaged are volunteers with the text to start with? Do they have a reason to read it, which might lead to a reason to fix it?
  3. What's the proportion of errors in the text? The higher the proportion, the more frustrating the experience, and the more likely it is that a fresh transcription of the typewritten text would make more sense than correction.
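
One rough way to put a number on the third question is to correct a small sample of pages by hand and compare the result with the raw OCR. The sketch below is our own illustration, not a FromThePage feature: the function names and the sample line are made up, and it measures a simple character error rate (the number of single-character edits needed to fix the OCR, divided by the length of the corrected text). What counts as "too many errors" will depend on your volunteers and your material.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes,
    or substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def character_error_rate(ocr_text: str, corrected_text: str) -> float:
    """Edits needed to fix the OCR text, per character of corrected text."""
    if not corrected_text:
        return 0.0 if not ocr_text else 1.0
    return edit_distance(ocr_text, corrected_text) / len(corrected_text)


# Hypothetical sample line, not taken from any real project.
ocr_line = "Tbe qnick brown f0x"
corrected_line = "The quick brown fox"

rate = character_error_rate(ocr_line, corrected_line)
print(f"About {rate:.1%} of characters need fixing")
```

Running this on a handful of representative pages before launching a project gives a feel for how much fixing each page will ask of a volunteer.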

You may also be interested in this article Ben wrote with a more technical review of OCR correction in FromThePage.

If you'd like to start a transcription -- or OCR correction -- project in FromThePage, contact us and we'll get you started.

Filed Under: Uncategorized Tagged With: crowdsourcing
