This is a response to the recently published "A Research Agenda for Historical and Multilingual Optical Character Recognition" by David A. Smith and Ryan Cordell, with the support of The Andrew W. Mellon Foundation. The report analyzes current challenges faced by humanities researchers using OCR text and outlines important avenues for research to improve OCR quality. … [Read more...] about Improving OCR using FromThePage
crowdsourcing
OCR Correction vs Transcription
We found this recent comment by a volunteer on a FromThePage project to be fascinating: "I am sad to report I have found numerous errors, too many to even begin to fix, within these pages... It will be much easier to completely transcribe from the beginning correctly, than try and fix ALL the typos. Would you like me to do this for the Library?" OCR correction is arguably … [Read more...] about OCR Correction vs Transcription
DH Project Ideas from the Texas AI Summit
Friday I attended the Texas AI Summit, a one-day AI-focused conference conveniently in my hometown. The fun of a conference like this is looking for techniques and tools that could be applied to Digital Humanities projects; the pain is sitting through so many eye-bleeding talks with mathematical formulas for classifying data. Here are the two best ideas. You're welcome. 1) Use … [Read more...] about DH Project Ideas from the Texas AI Summit
Crowdsourcing the Alabama World War I Service Records
On August 2, 2018, Meredith McDonough of the Alabama Department of Archives and History and Ben Brumfield of Brumfield Labs presented "Crowdsourcing the Alabama World War I Service Records" at the CONTENTdm User Meeting. Our fellow panelists were Phil Sager and Kristen Newby of the Ohio History Connection, who presented on the crowdsourced transcription system they had built … [Read more...] about Crowdsourcing the Alabama World War I Service Records
10 Ways to Host a Great Transcribathon
What's a transcribathon? In the words of Paul Dingman of the Folger Shakespeare Library, on the very first transcribathon in 2014: "Transcribathon, an event running from noon to midnight in which we transcribe and encode manuscripts..." Your transcribathon doesn't have to be a 12-hour-long marathon, but we've gathered a list of ideas from other transcribathons that … [Read more...] about 10 Ways to Host a Great Transcribathon