FromThePage Blog

about crowdsourcing, manuscript transcription, digital humanities and digital documentary editions

ocr

Improving OCR using FromThePage

February 27, 2019 By Sara Brumfield

This is a response to the recently published "A Research Agenda for Historical and Multilingual Optical Character Recognition" by David A. Smith and Ryan Cordell, with the support of The Andrew W. Mellon Foundation.  The report analyzes current challenges faced by humanities researchers using OCR text and outlines important avenues for research to improve OCR quality.  In many … [Read more...] about Improving OCR using FromThePage

Crowdsourcing the Alabama World War I Service Records

August 4, 2018 By Ben Brumfield

On August 2, 2018, Meredith McDonough of the Alabama Department of Archives and History and Ben Brumfield of Brumfield Labs presented "Crowdsourcing the Alabama World War I Service Records" at the CONTENTdm User Meeting.  Our fellow panelists were Phil Sager and Kristen Newby of the Ohio History Connection, who presented on the crowdsourced transcription system they had built … [Read more...] about Crowdsourcing the Alabama World War I Service Records

Detecting Handwriting in OCR Text

February 25, 2013 By Ben Brumfield

This is my fourth and final post about the iDigBio Augmenting OCR Hackathon.  Prior posts covered the hackathon itself, my presentation on preliminary results, and my results improving the OCR on entomology specimens.  The other participants are slowly adding their results to the hackathon wiki, which I recommend checking back on (their efforts were much more … [Read more...] about Detecting Handwriting in OCR Text

Results of the "Ocrocrop" Approach to Improving OCR

February 15, 2013 By Ben Brumfield

This project attempted to improve the quality of OCR applied to difficult entomology images[*] by cropping labels from the images to run through OCR separately. In order to identify labels on the image to crop, an initial, 'naive' pass of OCR was made over the whole image, generating both A) a set of rectangles on the image defined as word bounding boxes by the OCR engine, … [Read more...] about Results of the "Ocrocrop" Approach to Improving OCR

iDigBio Augmenting OCR Hackathon

February 15, 2013 By Ben Brumfield

I spent the last three days at the iDigBio Augmenting OCR Hackathon working alongside mycologists, botanists, entomologists, herbarium managers, and bioinformaticians to explore ways to improve parsing of digitized specimen labels.  While I'm pleased with the results of my own contribution, I'd like to take a minute to talk about the hackathon process itself before I post … [Read more...] about iDigBio Augmenting OCR Hackathon

Copyright © 2021 · FromThePage.com