• Skip to main content
  • Skip to secondary menu
  • Skip to primary sidebar

FromThePage Blog

Crowdsourcing, transcription and indexing for libraries and archives

  • Home
  • Interviews
  • crowdsourcing
  • how-to
  • Back to FromThePage
  • Collections

Transcriptions, Screen Readers, and ChatGPT

May 12, 2023 by Ben Brumfield

Usually, when we transcribe documents, we type what we see, retaining irregular spelling and punctuation.  But does this practice–a scholarly standard–serve everyone?  I wondered how screen readers deal with verbatim transcriptions, so I ran an experiment.  I used Microsoft Narrator, the screen reader built into Windows, to read passages aloud from a 19th-century farm diary, a 17th-century beer recipe, and a part of the 15th-century Prose Alexander. 

To my surprise, the 17th-century brewing recipe was the clearest to understand. Although it did pronounce the final "e" in "worke" and did not modernize "yest" to "yeast," the context made the meaning clear. At worst, the pronunciation of "u" for modern "v" in words like "uery" or "euening" was a bit distracting.

The Middle English text was also fairly reasonable. Narrator did a decent job pronouncing words like "swilke." However, some spellings that a reader might modernize were vocalized literally in ways that confused me. For example, "Whi, ame I thi son?" was pronounced "Whee, amey I thee son?" which I heard as "Whee, am I the sun?".

The worst text was the 19th-century farm diary. Narrator spelled out abbreviations like "hhds," instead of expanding them to “hogsheads”, but the terse wording of the entries was the biggest problem. The entry dates were spoken as numbers dropped into the text seemingly at random, and any passage in which the diarist did not write complete sentences sounded like a jumble of words.

This experiment showed that comprehensibility depends more on the nature of the text than on the age of the language or irregularity of the spelling. So how can we make such texts accessible? 

Paraphrasing is one of the things ChatGPT is supposed to be good at, so I ran the farm diary through it to see if there was any improvement:

The paraphrase has some serious problems: ”balance” is misinterpreted, “On the 26th there was a fine rain yesterday” is awkward, and the last entry is missing “grows” (for “gross”) and replaces the offensive “negroes” with “slaves”. But does the paraphrase cause more problems for a screen reader user than the verbatim text?

After my experience using Narrator on the farm diary, I have to say that the paraphrase is an improvement despite the errors introduced by ChatGPT. No human reading “hhds” aloud would say “H. H. D. S.” unless they had never encountered the term, so doing so is an error. Stringing together passages separated by line breaks–another screen reader behavior on verbatim text–also presents a massive barrier to readers of the text. 

It’s important to note that the AI paraphrase was not able to help in passages that require special historical knowledge. Without annotation by a scholarly editor, the financial transaction involving Stephen Clement remains as opaque in the AI version as in the original.

So I really think that AI might be a real help for making verbatim texts more accessible to the blind and visually impaired. It’s not as good as the normalized version a well-funded scholarly edition might present as an optional format, but it’s a lot more cost-effective. Using AI for increasing access to our texts (rather than cheating on essays or spamming people) gets me excited.

Filed Under: Uncategorized

Primary Sidebar

What’s Trending on The FromThePage Blog

  • Guide to Digitizing Your Archives
  • 10 Ways AI Will Change Archives
  • An Interview with Keith Mitchell of The National…
  • How to Handle Racial or Ethnic Slurs &…
  • Project Profile: Stanford University Archives
  • An Interview with Rebecca Dillmeier of the United…

Recent Client Interviews

An Interview with Candice Cloud of Stephen F. Austin State University

An Interview with Shanna Raines of the Greenville County Library System

An Interview with Jodi Hoover of Digital Maryland

An Interview with Michael Lapides of the New Bedford Whaling Museum

An Interview with NC State University Libraries

Read More

ai artificial intelligence crowdsourcing features fromthepage projects handwriting history iiif indexing Indianapolis Indianapolis Children's Museum interview Jennifer Noffze machine learning metadata ocr paleography podcast racism Ryan White spreadsheet transcription transcription transcription software

Copyright © 2025 · Magazine Pro on Genesis Framework · WordPress · Log in

Want more content like this?  We publish a newsletter with interesting thought pieces on transcripion and AI for archives once a month.


By signing up, you agree to our Privacy Policy and Terms of Service. We may send you occasional newsletters and promotional emails about our products and services. You can opt-out at any time.  We never sell your information.