
FromThePage Blog

about crowdsourcing, manuscript transcription, digital humanities and digital documentary editions


OCR Correction vs Transcription

February 4, 2019 By Sara Brumfield

We found this recent comment by a volunteer on a FromThePage project to be fascinating:

"I am sad to report I have found numerous errors, too many to even begin to fix, within these pages... It will be much easier to completely transcribe from the beginning correctly, than try and fix ALL the typos. Would you like me to do this for the Library? "

OCR correction is arguably easier than full transcription, but this volunteer's comment suggests it can be less fun and more frustrating.

We spoke to the owner of this project, and she's hoping that high-school-aged volunteers, who are on site and working together, might be a good match for OCR correction. Younger transcribers often have less experience with cursive handwriting, so correcting OCR of printed text may suit them better than transcribing manuscripts. I'm curious to see if this idea works.

We've seen OCR correction work on projects like Trove -- the OCR feeds the search, and the person reading the article is motivated to fix the mistakes when they run into them, because they are emotionally invested in the material. (No one likes mistakes!)

We've also seen great projects, like the Alabama Department of Archives and History's WW1 service cards project, that engage volunteers to transcribe typewritten text.

We also suspect that the proportion of errors in the text makes a difference to the corrector's experience.

If you are thinking about doing an OCR correction project, we would recommend you think about:

  1. How interesting is the text to start with? Is it fun to read?
  2. How engaged are volunteers with the text to start with? Do they have a reason to read it, which might lead to a reason to fix it?
  3. What's the proportion of errors in the text? The higher the proportion, the more frustrating the experience, and the more likely it is that transcribing the typewritten text from scratch would make sense.
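
One rough way to put a number on the third question is to hand-correct a small sample page and compare it with the raw OCR. The Python sketch below is only an illustration, not something built into FromThePage: the function names and the sample strings are our own assumptions, and the measure it computes (edit distance divided by the length of the corrected text, often called character error rate) is just one common choice. What counts as "too many errors" is a judgment call for each project.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def character_error_rate(ocr_text: str, corrected_text: str) -> float:
    """Edits needed to fix the OCR, per character of the corrected text."""
    if not corrected_text:
        raise ValueError("corrected_text must not be empty")
    return edit_distance(ocr_text, corrected_text) / len(corrected_text)


# Hypothetical sample: one line of raw OCR and the same line after hand correction.
ocr_line = "Tbe quick hrown fox"
corrected_line = "The quick brown fox"
rate = character_error_rate(ocr_line, corrected_line)
print(f"{rate:.1%} of characters need fixing")
```

Running this on a handful of representative pages before launching a project gives a sense of whether volunteers will be fixing an occasional typo or retyping most of every page.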

 

You may also be interested in this article Ben wrote with a more technical review of OCR correction in FromThePage.

If you'd like to start a transcription -- or OCR correction -- project in FromThePage, contact us and we'll get you started.



Copyright © 2021 · FromThePage.com