FromThePage Blog

Crowdsourcing, transcription and indexing for libraries and archives


Can LLMs Help With Boring Forms?

June 16, 2025 by Sara Brumfield

Two weeks ago, I looked at Ben and said, "What we obviously should build next is field-based AI-Assist."  Transcribing forms or "spreadsheet-like" ledgers and rolls is not as much fun as transcribing letters or field books, so anything that makes the process easier might appeal to volunteers.  And because the text is more "data" than "text," it should suffer less from seductively plausible AI hallucinations than narrative material does.

Then Ben suggested "what if we ran pages through two different models and compared the results?  We could highlight places where two models differed in their interpretation of the text on the page, and ask volunteers to be the tie breakers."

In AI terms this is called "consensus-based validation," although usually folks throw a lot of models at the task and forget that humans might be better tie breakers.
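The core of this idea can be sketched in a few lines: run each page through two models, collect the structured field values, and flag any field where the transcriptions disagree. This is not FromThePage code; the function, field names, and sample values below are invented for illustration.

```python
# Hypothetical sketch of consensus-based validation for field transcription:
# flag the fields where two models' readings of the same page differ.

def fields_needing_review(model_a: dict, model_b: dict) -> list[str]:
    """Return the names of fields where the two transcriptions disagree."""
    all_fields = set(model_a) | set(model_b)
    return sorted(
        f for f in all_fields
        if model_a.get(f, "").strip() != model_b.get(f, "").strip()
    )

# Invented sample output from two models reading the same engineering card.
qwen = {"drawing_no": "A-1042", "scale": '1" = 10\'', "draftsman": "J. Smith"}
gemini = {"drawing_no": "A-1042", "scale": "1 in = 10 ft", "draftsman": "J. Smith"}

print(fields_needing_review(qwen, gemini))  # only the "scale" field disagrees
```

A page with an empty review list could be auto-accepted; any flagged field goes to a volunteer as tie breaker.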

Lucky for us, the perfect project arrived during these conversations. Nick Zmijewski at the Industrial Archives & Library emailed me, "We have a bunch of these engineer designs and they all have pretty standard data, written in clear draftsman hand, in a box on the bottom right. Is there a way we could automate this?"

AI experiments are fun, so we did two.  The first was with our colleague Mike Cooper-Stachowsky, who ran our three sample cards through the open source model Qwen-VL 72b.  I ran the same cards through Gemini 2.0 Flash.  Then we entered the results of each run into FromThePage as if they were manual transcriptions, so we could use FromThePage's "versions" tab for a page to see the differences between the models.

Here's what we learned:

The results from both models were really good.  Not perfect, but surprisingly high quality. 

Even with the surprising quality, the transcriptions differed quite a bit between models.  If we were to flag pages with any differences for human review, we'd end up flagging every single page.  This suggests that projects that want to automate more should simplify the task by reducing the number of fields collected.  For example, if "scale" wasn't an important piece of metadata to collect, leaving it out would increase the consensus between the two models, since that was an area where they often disagreed.

[Image: FromThePage "versions" tab showing differences between the two models]

For this experiment, the Gemini model was better than the open source model, filling in more fields and producing fewer character errors.

Sometimes, the results differed between different runs of the same image against the same model.  The differences tended to be in the harder-to-read fields like names, so I thought this was a good measure of "uncertainty": if the same "brain" (LLM) interprets the same text differently from one moment to the next, wouldn't that mean those fields were more difficult?  That suggests that this sort of model "voting" might work even with just one model.
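The single-model variant of the voting idea can be sketched the same way: run the same image through one model several times and flag any field whose value varies across runs. Again, this is an illustrative sketch, not FromThePage code, and the field names and values are invented.

```python
# Hypothetical sketch of single-model uncertainty detection:
# a field whose value changes across repeated runs of the same model
# on the same image is probably hard to read and worth human review.

def uncertain_fields(runs: list[dict]) -> list[str]:
    """Return fields whose transcription varies across repeated runs."""
    fields = set().union(*runs)  # every field seen in any run
    return sorted(
        f for f in fields
        if len({run.get(f, "").strip() for run in runs}) > 1
    )

# Invented output from three runs of one model on the same card.
runs = [
    {"name": "Harrisburg Br.", "date": "1923"},
    {"name": "Harrisburg Bx.", "date": "1923"},
    {"name": "Harrisburg Br.", "date": "1923"},
]

print(uncertain_fields(runs))  # "name" varies across runs; "date" is stable
```

Stable fields could be accepted with more confidence, while unstable ones are routed to volunteers, which is exactly the human-as-tie-breaker division of labor described above.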

We’re excited about this approach of machines and humans collaborating, with machine-generated results pointing humans to the parts of projects that most need judgment and interpretation.

Have a similar project you’d like us to experiment with?  Let us know!

Filed Under: ai & crowdsourcing, Uncategorized
