FromThePage Blog

Crowdsourcing, transcription and indexing for libraries and archives


ocr

Improving OCR using FromThePage

February 27, 2019 By Sara Brumfield

This is a response to the recently published "A Research Agenda for Historical and Multilingual Optical Character Recognition" by David A. Smith and Ryan Cordell, with the support of The Andrew W. Mellon Foundation. The report analyzes current challenges faced by humanities researchers using OCR text and outlines important avenues for research to improve OCR quality. …

Crowdsourcing the Alabama World War I Service Records

August 4, 2018 By Ben Brumfield

On August 2, 2018, Meredith McDonough of the Alabama Department of Archives and History and Ben Brumfield of Brumfield Labs presented "Crowdsourcing the Alabama World War I Service Records" at the CONTENTdm User Meeting. Our fellow panelists were Phil Sager and Kristen Newby of the Ohio History Connection, who presented on the crowdsourced transcription system they had built …

Detecting Handwriting in OCR Text

February 25, 2013 By Ben Brumfield

This is my fourth and final post about the iDigBio Augmenting OCR Hackathon. Prior posts covered the hackathon itself, my presentation on preliminary results, and my results improving the OCR on entomology specimens. The other participants are slowly adding their results to the hackathon wiki, which I recommend checking back with (their efforts were much more …

Results of the "Ocrocrop" Approach to Improving OCR

February 15, 2013 By Ben Brumfield

This project attempted to improve the quality of OCR applied to difficult entomology images[*] by cropping labels from the images to run through OCR separately. In order to identify labels on the image to crop, an initial, 'naive' pass of OCR was made over the whole image, generating both A) a set of rectangles on the image defined as word bounding boxes by the OCR engine, …

iDigBio Augmenting OCR Hackathon

February 15, 2013 By Ben Brumfield

I spent the last three days at the iDigBio Augmenting OCR Hackathon working alongside mycologists, botanists, entomologists, herbarium managers, and bioinformaticians to explore ways to improve parsing of digitized specimen labels. While I'm pleased with the results of my own contribution, I'd like to take a minute to talk about the hackathon process itself before I post …
