
FromThePage Blog

Crowdsourcing, transcription and indexing for libraries and archives


"The Landscape of Crowdsourcing and Transcription" at Duke University

November 23, 2013 by Ben Brumfield

I spent part of this week at Duke University with the Duke Collaboratory for Classics Computing (DC3): Josh Sosin, Hugh Cayless, and Ryan Baumann. We discussed ideas for mobile epigraphy applications, argued about text encoding, and did some hacking. We loaded an instance of FromThePage onto the DC3's development machine and seeded it with the 1859 journal of Viscountess Emily Anne Beaufort Smyth Strangford (part of Duke Libraries' amazing collection of Women's Travel Diaries). Transcribing six pages of her tour through Smyrna and Syria together suggested some exciting enhancements for the transcription tool and revealed a few bugs along the way. I'm really looking forward to collaborating with the DC3 on this project.

On Wednesday, I gave an introductory talk on crowdsourced manuscript transcription at the Perkins Library, titled "The Landscape of Crowdsourcing and Transcription":

One of the most popular applications of crowdsourcing to cultural heritage is transcription. Since OCR software doesn’t recognize handwriting, human volunteers are converting letters, diaries, and log books into formats that can be read, mined, searched, and used to improve collection metadata. But cultural heritage institutions aren’t the only organizations working with handwritten material, and many innovations are happening within investigative journalism, citizen science, and genealogy.

This talk will present an overview of the landscape of crowdsourced transcription: where it came from, who’s doing it, and the kinds of contributions their volunteers make, followed by a discussion of motivation, participation, recruitment, and quality controls.

The talk and visit got a nice write-up in Duke Today, which includes this quote by Josh Sosin:

Sosin said that although many students and professors visit the library's collections and partially transcribe the sources that are pertinent to their research, nearly all of these transcripts disappear once the researchers leave the library.

"Scholars or students come to the Rubenstein, check out these precious materials, they transcribe and develop all sorts of interesting ideas about them," Sosin said. "Then they take their notebooks out of the library and we lose all the extra value-added materials developed by these students. If we can host a platform for students and scholars to share their notes and ideas on our collections, the library's base of knowledge will grow with every term paper or book that our scholars produce."

Video of "The Landscape of Crowdsourcing and Transcription" (by Ryan Baumann):

Slides from the talk:

Previous versions of this talk were delivered at the University of Southern Mississippi (2013-09-12) and the Wisconsin Historical Society (2013-09-25). This version differs substantially in its discussion of quality control mechanisms (on the video from 26:15 through 31:30, slides 37-40), an addition suggested by questions posed at USM and WHS.

Filed Under: presentations, videos Tagged With: crowdsourcing
