
FromThePage Blog

about crowdsourcing, manuscript transcription, digital humanities and digital documentary editions


Google Reads Fraktur

February 7, 2008 By Ben Brumfield

Yesterday, German blogger Archivalia reported that the quality of Fraktur OCR at Google Books has improved. There are still some problems, but they are on the same order as those found in books printed in Antiqua. Compare the text-only and page-image versions of Geschichte der teutschen Landwirthschaft (1800) with the text and image versions of the Antiqua-set Altnordisches Leben (1856).

This is a big deal, since previous OCR efforts produced results that were not only unreadable, but un-searchable as well. This example from the University of Michigan's MBooks website (digitized in partnership with Google) gives a flavor of the prior quality: "Ueber den Ursprung des Uebels." ("On the Origin of Evil") results in "Us-Wv ben Uvfprun@ - bed Its-beEd."
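To put a rough number on how garbled that result is, one common yardstick is the character error rate (CER): the edit distance between the correct text and the OCR output, divided by the length of the correct text. The sketch below is only an illustration using the two strings quoted above. The function names are hypothetical, and nothing here reflects how Google or MBooks actually evaluate their OCR.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance (insertions, deletions, substitutions) between a and b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, ocr_output: str) -> float:
    """Edit distance divided by the reference length (hypothetical helper)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, ocr_output) / len(reference)


# The title and its OCR rendering, as quoted in the paragraph above.
reference = "Ueber den Ursprung des Uebels."
ocr_output = "Us-Wv ben Uvfprun@ - bed Its-beEd."

cer = character_error_rate(reference, ocr_output)

# A plain substring search stands in for full-text search here (an assumption).
searchable = [word for word in ("Ursprung", "Uebels") if word in ocr_output]

print(f"Character error rate: {cer:.2f}")
print(f"Search terms found in OCR text: {searchable}")
```

Neither search term survives in the OCR output, which is what makes such a page effectively invisible to a full-text search.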

It's thrilling that these improvements are being made to the big digitization efforts (my guess is that they've added new blackletter typefaces to the OCR algorithm and reprocessed the previously scanned images), but this highlights how dependent OCR technology is on well-known typefaces. Occasionally, when I tell friends about my software and the diaries I'm transcribing, I'm asked, "Why don't you just OCR the diaries?" Unfortunately, until someone comes up with an OCR plugin for Julia Brumfield (age 72) and another for Julia Brumfield (age 88), we'll be stuck transcribing the diaries by hand.

