FromThePage Blog

Crowdsourcing, transcription and indexing for libraries and archives

OCR Correction vs Transcription

February 4, 2019 By Sara Brumfield

We found this recent comment by a volunteer on a FromThePage project to be fascinating:

"I am sad to report I have found numerous errors, too many to even begin to fix, within these pages... It will be much easier to completely transcribe from the beginning correctly, than try and fix ALL the typos. Would you like me to do this for the Library? "

OCR correction is arguably easier than full transcription, but judging by this volunteer's comment it can be less fun and more frustrating.

We spoke to the owner of this particular project, and she's hoping that high-school-aged volunteers, who are on site and working together, might be a good match for OCR correction. Younger transcribers often have less experience with cursive handwriting, so correcting OCR of printed text may suit them better than transcribing manuscripts. I'm curious to see if this idea works.

We've seen OCR correction work on projects like Trove: the OCR feeds the search, and people reading an article are motivated to fix the mistakes they run into because they are emotionally invested in the material. (No one likes mistakes!)

We've also seen great projects, like the Alabama Department of Archives and History's WW1 service cards project, that engage volunteers to transcribe typewritten text.

We also suspect that the proportion of errors in the text makes a difference to the corrector's experience.

If you are thinking about doing an OCR correction project, we would recommend you think about:

  1. How interesting is the text to start with?  Is it fun to read?
  2. How engaged are volunteers with the text to start with?  Do they have a reason to read it, which might lead to a reason to fix it?
  3. What's the proportion of errors in the text?  The higher the proportion, the more frustrating the experience, and the more likely it is that transcribing the typewritten text from scratch would make sense.
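
One rough way to put a number on the proportion of errors in the last question is to hand-correct a small sample page and compare it with the raw OCR output. The sketch below is only an illustration, not something FromThePage provides: the function names and the sample strings are made up, and it assumes you have both the OCR text and a corrected version of the same passage. It computes a character error rate as the edit distance divided by the length of the corrected text.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character insertions,
    deletions and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(ocr_text: str, corrected_text: str) -> float:
    """Edit distance relative to the length of the corrected text."""
    if not corrected_text:
        raise ValueError("corrected_text must not be empty")
    return edit_distance(ocr_text, corrected_text) / len(corrected_text)


# Hypothetical sample: one line of OCR output and its hand-corrected version.
sample_ocr = "Tbe qu1ck brown fox"
sample_corrected = "The quick brown fox"

rate = character_error_rate(sample_ocr, sample_corrected)
print(f"Character error rate: {rate:.1%}")
```

Checking a handful of pages this way before launching a project gives a sense of how much fixing each volunteer would face. What rate counts as "too frustrating" is a judgment call that depends on the material and the volunteers.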

You may also be interested in this article Ben wrote, which gives a more technical review of OCR correction in FromThePage.

If you'd like to start a transcription or OCR correction project in FromThePage, contact us and we'll get you started.


Copyright © 2023 · FromThePage.com