
FromThePage Blog

Crowdsourcing, transcription and indexing for libraries and archives


How Do You Know Whether AI Is "Good Enough"?

December 9, 2025 by FromThePage

Yesterday we deployed two new features to help you evaluate results from Gemini 3 (and eventually other models) against human-transcribed or human-corrected text.

First, we’ve developed a comparison screen that shows the differences between an AI-generated page transcription and human-created ground truth:

Next, we calculate statistics, again comparing the AI draft against the human ground truth.
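For readers who want to see what such a statistic involves, here is a minimal sketch of one common measure, character error rate (CER): the edit distance between the AI draft and the ground truth, divided by the length of the ground truth. The function names are ours for illustration, and this is an assumption about a typical approach, not a description of how FromThePage computes its statistics.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits (insert, delete,
    substitute) needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to save memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # drop a character from a
                current[j - 1] + 1,      # add a character to a
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def character_error_rate(ai_draft: str, ground_truth: str) -> float:
    """Edit distance divided by the length of the human ground truth."""
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    return levenshtein(ai_draft, ground_truth) / len(ground_truth)
```

Calling `character_error_rate(ai_text, human_text)` on a page returns a fraction, so a result of 0.01 corresponds to a 1% character error rate.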

This gives you data – which we think is a good start – but how do you decide whether it’s good enough? Again, we look to Mark Humphries, who recently published a very thorough analysis of Gemini 3’s quality. Here are his three levels of quality, measured by character error rate, with the reliability of each and the level of human intervention each would require. (As someone who runs a human-centered software platform, I am relieved there is still a place for us!)

"A 3% error rate equates to about 3-4 errors per sentence, making the document a first draft at best but also fundamentally untrustworthy. An error rate of 1% means around one error per sentence, readable but still in need of significant and close proof reading. At 0.5%, a document becomes both usable and trustworthy with around 1-2 characters wrong on each page. If one planned to publish such a document, careful proof reading would still be necessary, but it would be more akin to copy-editing than re-interpretation."
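One way to put these levels to work is to map a measured character error rate to the kind of review it implies. The sketch below is only an illustration: the quote gives three reference points (3%, 1%, and 0.5%), and treating 0.5% and 1% as upper bounds of bands is our assumption, as are the function name and labels.

```python
def review_level(cer: float) -> str:
    """Map a character error rate (a fraction, e.g. 0.01 for 1%)
    to a rough level of human review.

    The cutoffs are an assumption: the quoted analysis describes
    reference points, not strict bands.
    """
    if cer < 0:
        raise ValueError("character error rate cannot be negative")
    if cer <= 0.005:
        # Usable and trustworthy; proofreading is closer to copy-editing.
        return "copy-edit"
    if cer <= 0.01:
        # Readable, but needs significant and close proofreading.
        return "close proofreading"
    # A first draft at best; not trustworthy without heavy correction.
    return "first draft"
```

For example, `review_level(0.004)` falls in the copy-edit band, while `review_level(0.03)` is treated as a first draft.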

It’s also interesting to look at the types of errors – a lot of differences in the screenshot above (from a letter in the Hagley Museum Archives) are not errors so much as differences in transcription choices, such as capital letters instead of lower case (or vice versa) or how many dashes are used to represent a strikethrough.

We do know we’ll need to iterate on our prompt so Gemini’s output more closely matches our default transcription conventions. We’re also going to add projects’ custom transcription conventions to the default prompt and see how that goes.
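If you want to separate convention differences from real misreadings in your own comparisons, one option is to normalize both texts before comparing them. The sketch below is a hypothetical example that ignores letter case, the number of dashes in a run, and extra whitespace. Which differences count as "just conventions" is a project decision, and the function names here are ours, not part of FromThePage.

```python
import re


def normalize_conventions(text: str) -> str:
    """Remove differences we have chosen to treat as transcription
    conventions rather than errors (an assumption, adjust per project)."""
    text = text.casefold()                # ignore capitalization
    text = re.sub(r"-{2,}", "-", text)    # any run of dashes becomes one dash
    text = re.sub(r"\s+", " ", text)      # collapse whitespace and line breaks
    return text.strip()


def differs_only_by_convention(ai_draft: str, ground_truth: str) -> bool:
    """True when the texts differ as written but match after normalization."""
    return (
        ai_draft != ground_truth
        and normalize_conventions(ai_draft) == normalize_conventions(ground_truth)
    )
```

Comparing statistics on the raw text and on the normalized text gives a rough sense of how much of the measured error rate comes from convention choices rather than misread words.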

The most reassuring thing in Mark’s analysis – reinforced by the results we’re seeing in the Gemini transcriptions and comparisons in FromThePage – is that Gemini doesn’t make things up:

"The most remarkable thing, though, is that Gemini is so often able to push past the ruts created in training that want to steer it towards correcting historical spelling errors and capitalizations. Most of the time—99% in fact—it succeeds.
Hallucinations were entirely absent. By hallucinations, I mean insertions or replacements that are not derived from the text."

And that, more than anything else, is why we think this is a viable solution for archives.

Want to dive deeper?

  • Here are two public projects with Gemini transcription and human ground truth:
      • Wood Family Letters from the Hagley Library
      • Various inventions by Wilber Moore Stilwell and Gladys Ferree Stilwell from the University of South Dakota

  • Join our webinar on our Gemini 3 integration next Thursday, December 11th.
  • Start your own 200-page trial and click the “Generate AI Drafts” button as you import your own material.

Filed Under: ai & crowdsourcing, Uncategorized
