Recently, both OpenAI and Google released new multi-modal large language models, which were immediately touted for their ability to transcribe documents. Also last week, I transcribed this document from the Library of Virginia’s collection from the Virginia Revolutionary Conventions. Let’s read it together: Whereas his Excellency John Earl of Dun- more Lieutenant and … [Read more...] about Can Multi-Modal LLMs Transcribe Historic Documents?
Connection Makes Quality
I recently saw a presentation by Dick Kasperowski and Olof Karsvall about research they’d done on The Detective Section, a crowdsourcing project at the Swedish National Archives. Their findings contradicted some of the conventional wisdom that volunteers introduce bias into results when they bring their own subject-matter expertise to citizen science tasks, and that made me … [Read more...] about Connection Makes Quality
Measuring Success in Crowdsourcing Projects
On July 28, Sara and I hosted the first FromThePage User Group Meeting in Washington, DC. It was a fun event; to me it felt more like a party than a meeting, since I got to spend most of my time introducing my colleagues to each other. (That said, cocktails were replaced with breakfast tacos, which we thought a little more appropriate for 8:30 AM.) I spent some days … [Read more...] about Measuring Success in Crowdsourcing Projects
Transcriptions, Screen Readers, and ChatGPT
Usually, when we transcribe documents, we type what we see, retaining irregular spelling and punctuation. But does this practice–a scholarly standard–serve everyone? I wondered how screen readers deal with verbatim transcriptions, so I ran an experiment. I used Microsoft Narrator, the screen reader built into Windows, to read passages aloud from a 19th-century … [Read more...] about Transcriptions, Screen Readers, and ChatGPT
What Machines Can’t Replace: Old Fashioned Human Efforts
Let’s talk about transcription in scholarly editions – those collections of “papers” by or about historical or literary figures, published for readers and researchers. There are basically three ways edition projects transcribe papers. The first is the old-school way with staff members doing it manually. The second is a group effort, which can be done through crowdsourcing … [Read more...] about What Machines Can’t Replace: Old Fashioned Human Efforts