FromThePage Blog

Crowdsourcing, transcription and indexing for libraries and archives

What We've Learned About HTR

November 7, 2024 by Sara Brumfield

We’ve spent the last six months working with our AI Assist Development Partners, using their wide variety of archival documents to gather feedback on our AI Assist and AI Draft features and to learn which kinds of documents are good candidates for handwritten text recognition (HTR) using Transkribus’ super models. Today, we’re sharing some of our findings on that second question. This is a long one, but we think you’ll find it interesting.

What Worked

“Very difficult” handwriting

Humans can read handwriting like this, but it takes a lot of effort and experience. HTR did a great job.

Original – UNC Cameron Papers

AI-Assist

Bleed through

Ink that bleeds through the page makes the front of the page hard to read. HTR didn’t have any problem identifying the “main” text on the page and ignoring the bleed through. 

Original – UNC Cameron Papers

AI-Assist

What Didn’t Work

Cross Hatched Writing

Because HTR services first segment the page into lines, and expect text to be linear, cross writing (also called cross hatched writing) wasn’t transcribed. It also caused problems with the transcription of the horizontally oriented writing.

Original – UNC Cameron Papers

AI-Assist
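
To see why a line-by-line segmentation step struggles with cross writing, here is a minimal toy sketch. It is not how Transkribus or any particular HTR service works; it assumes a very simple "projection profile" segmenter that splits a page wherever a row contains no ink. The page grids and the function names `segment_lines` and `add_cross_writing` are hypothetical, made up for this illustration.

```python
def segment_lines(page):
    """Return (first_row, last_row) ranges that contain ink.

    Toy projection-profile segmentation: a text line is any run of
    consecutive rows with at least one ink pixel ('#').
    """
    profile = [row.count("#") for row in page]  # ink pixels per row
    lines, start = [], None
    for y, ink in enumerate(profile):
        if ink > 0 and start is None:
            start = y                      # a text line begins
        elif ink == 0 and start is not None:
            lines.append((start, y - 1))   # a blank row ends the line
            start = None
    if start is not None:                  # line runs to the bottom edge
        lines.append((start, len(page) - 1))
    return lines


def add_cross_writing(page, columns):
    """Overlay vertical strokes, imitating a second layer written at 90 degrees."""
    return [
        "".join("#" if x in columns else ch for x, ch in enumerate(row))
        for row in page
    ]


# Hypothetical page: two horizontal lines of writing separated by blank rows.
plain = [
    "..........",
    ".########.",
    ".########.",
    "..........",
    "..........",
    ".########.",
    ".########.",
    "..........",
]
crossed = add_cross_writing(plain, columns={2, 5, 8})

plain_lines = segment_lines(plain)      # [(1, 2), (5, 6)]
crossed_lines = segment_lines(crossed)  # [(0, 7)]
print(plain_lines)
print(crossed_lines)
```

On the plain page the sketch finds two separate lines. Once the vertical strokes are added, no row is blank any more, so the two lines merge into one block and the vertical layer is never treated as text at all. Real segmentation models are far more sophisticated, but the same assumption that text runs in one direction is what cross writing breaks.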

Old, Faded, Damaged Documents

This is in-between – it half worked. I think it’s about as good as a human could do.

Original – UNC Cameron Papers

AI-Assist

Text in Pencil

This was a surprise: when text is written in pencil, especially when there is inked text on the same page, the results were particularly poor.

Carnegie Hall Archives

Text in Red Pen

Like the pencil, but even worse, was text written in red pen.

Carnegie Hall Archives

Text Written in Between Lines

In this case, multiple lines are squeezed into the end of a line, and the HTR service picked up only some of them.

Carnegie Hall Archives
