FromThePage Blog

Crowdsourcing, transcription and indexing for libraries and archives

How can my archive crowdsource record indexing?

September 8, 2022 by Ana Bastida

Last May, Ben and Sara Brumfield hosted a webinar on successful crowdsourced indexing. The presentation, linked below in a video and embedded as slides, presents a walkthrough of how project staff selected material, recruited volunteers, developed instructions, encouraged engagement, and used volunteer contributions to improve public access. You can sign up for future webinars here.

Read through Ben and Sara's presentation: 

Successful Crowdsourced Indexing
Sara Brumfield, Ben Brumfield
FromThePage

This is Sara Brumfield with FromThePage.  I’m here today with Ben, my partner, and we wanted to tell y’all what we’ve learned over a decade of running crowdsourcing projects, and how that applies specifically to indexing projects.

Agenda

Alabama World War I service card

Let’s start by defining indexing. In our software, FromThePage, indexing means asking users to transcribe specific elements of a digitized document into structured data fields.

This is an Alabama World War I service card, and you can see that volunteers are asked to separate out different portions of the name in the form, and aren’t asked to enter the soldier’s serial number, rank or transfer history.  

The project owner has configured the specific fields to be indexed, and the format to display them in.

Missouri Slave Schedule

Indexing may also include spreadsheet-style projects.  

Here is a project indexing the slave schedule to the 1860 United States Census, in which different structured data is being indexed.  This includes page-level fields for the header on the census form, but also has a spreadsheet to create multiple records per page.
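To make the "page-level fields plus a spreadsheet" shape concrete, here is a minimal Python sketch. It is not FromThePage's internal data model; the `IndexedPage` class, the field names, and the values are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class IndexedPage:
    """Hypothetical model of one indexed page.

    page_fields holds values entered once per page (the form header);
    rows holds one dict per spreadsheet line on that page.
    """
    page_fields: dict[str, str]
    rows: list[dict[str, str]] = field(default_factory=list)

    def flatten(self) -> list[dict[str, str]]:
        """Copy the page-level fields onto every row so each record stands alone."""
        return [{**self.page_fields, **row} for row in self.rows]


# Placeholder values only; none of this is real census data.
page = IndexedPage(
    page_fields={"county": "Example County", "page_number": "12"},
    rows=[
        {"line_number": "1", "age": "30"},
        {"line_number": "2", "age": "12"},
    ],
)
records = page.flatten()
```

Flattening like this is one reasonable way to get "multiple records per page" out of a single form: each record carries its own copy of the header values, so it still makes sense once it is separated from the page.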

May Wright Sewall Papers

These indexing projects differ from full-text transcription, like this example in which every word of the letter has been transcribed by volunteers. 

State and National Archives

Our indexing features were developed by and for state and national archives.  Funding and guidance came from the Council of State Archives in the US.

ADAH & Missouri State Archives logos
Images of Meredith McDonough, Steve Murray, Christina Miller, and John Dougan

We’re going to talk a lot about projects from the Alabama Department of Archives and History and the Missouri State Archives, so we wanted to introduce the cast of characters; here are the people who run the two projects we’ll talk most about. On the left is Meredith McDonough, an archivist and digital assets coordinator at Alabama; then we have Steve Murray, the state archivist of Alabama. On the right is Christina Miller, the reference services manager at Missouri State Archives, and John Dougan, the state archivist of Missouri. Now you have some faces to put with the names.

What Material Should be Indexed?

Let’s talk for a minute about what kind of material should be indexed.

Index-worthy Material
- Access is hard
- Users need page-level indices
- Lots of named individuals
- Indices to other documents

Registration Oath

Meredith knows that just like they’ll never scan all of ADAH’s physical records, they will never transcribe all of their digitized records. Instead, they’re focusing on collections that would most benefit from extensive transcription, like this.

Texas State Library and Archives index

This is an index from the Texas State Library and Archives Commission; they’re transcribing an index to court cases. Those court cases are hard to find unless you page through the digital images of the original index; indexing this will increase the likelihood of the court case records being used.

Writing Instructions

Now let’s talk about writing instructions for your volunteers.

"Help"
- Inline instructions
- On page instructions
- “Help” tab 
- Printable PDFs/downloadable manuals

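We support three different types of help in FromThePage: inline instructions, on-page instructions, and the “Help” tab. Printable PDFs or downloadable manuals, which a project can link to, round out the list.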

County Tax List

This is a county tax list from 1841 from the Missouri State Archives. They’ve added these “instruction” blocks, which we highlight with a vertical bar, to help their transcribers.

The cool thing about this is that the volunteers are likely to read them, since they’re right next to the fields where they’re doing the work.  

They also give volunteers a place to put random stuff they find: notes that can be associated with a section.

You’ll also notice that they’re not collecting every single field that’s on the original.

Queensland State Archives

Here’s an example of transcription conventions from the Queensland State Archives’ Index to male prisoners admitted - HM Prison, Brisbane (Boggo Road) 1870-1928.

Wisconsin Historical Society

The third type of help that we have in FtP is what we call the help tab.  I’ve got a couple of different examples for this from different projects.

This is an example from the Wisconsin Historical Society.  We love how they annotated an image to point to different fields and included it in their help tab.

Cleburne County

The next four slides are all taken from the help tab of Alabama’s voter registration books project.

It’s really extensive. Meredith created text that gave people a guide to the collection and some general instructions for handling common issues like irregular spellings or illegible text.

FtP Navigation

Hints on navigating the project within the FromThePage software.

Page Format

They also included an annotated scan with an explanation of each field they’re collecting.

This is kind of intimidating, but Alabama has been running projects for four years.  They’ve been able to build on previous projects -- and you can build on some of their practices, too!

World War I Records

People like having something they can print out and have next to them.  Alabama links from their help page to their PDF project guide.  This is actually the same material that’s in the “help” tab, reformatted for PDF.

Soft Launch

After you’ve uploaded material and defined instructions, you might want to do a soft launch. 

In fact, I would argue that you can be doing this as you write your instructions, because the things you learn from the launch process will help you improve your documentation and project.

Public Launch

Once you’ve given the project a shake-down cruise, it’s time for a public launch.  You can hold an event  at the very beginning of a project and bring in people who are active in genealogy organizations, local historical organizations, or descendant communities.  

What this does is it allows you to introduce the material and project goals. You can give hands-on instructions to get people over the discomfort of “What am I doing here?” and “Is my work good enough?” And then they’re able to take that enthusiasm and that expertise back to their communities and encourage other people to join in.

In addition to giving volunteers hands-on training, these events are also great publicity.  You can see the Julian Bond Transcribathon promoting its hashtag in a poster, with volunteers promoting the project on social media.   You can also see a post by a participant at a transcribathon at Cambridge talking about their excitement learning how to read medieval handwriting.  Which spreads the word even more.  

Missouri State Archives Event

The Missouri State Archives does a fascinating exercise at their launch events.  After they’ve explained the format of the material and given people instructions on how to index each field, John, the State Archivist, starts to index a page.  And he’s terrible at it!  They have the participants yell out what he’s doing wrong, and what he should be typing instead.  It makes the volunteers more confident in their own skills–after all, they just corrected the State Archivist–and reinforces the right way to handle common errors.

ADAH Tweet

Promoting the project through social media can be ongoing. Meredith says, “we have been stunned by the number who learn about our projects through virtual word-of-mouth: Tweets, Retweets, shares, and likes.”

1875 Voter Registration Books

Alabama keeps their large FromThePage projects private and then adds collaborators to the projects as they express interest, establishing a direct line of communication. Meredith says that these emails often begin with “I just saw a post about your project on . . .” or “I’m not from Alabama, but . . .”

Image of Jeanie Fisher

We’ve also seen local news channels -- both print and television -- pick up these projects because they are interesting, they’re visual -- you have cool old documents, you have people they can come in and take pictures of.  And they are feel-good stories -- they are a way for the local community and retirees to contribute to your mission of preserving memories.

This is coverage by a local TV station of a project from the Seattle Municipal Archives.

Print Media Screenshots

Traditional print media coverage is also important. The two busiest days on our software platform were over the Christmas holidays, when two newspapers published articles about a project at the North Carolina State Archives. People were together with their families and looking for something meaningful to do together, or perhaps looking for a productive escape.

Providing Guidance

One challenge for volunteer projects is to gently provide correction and guidance to users who are not following the instructions, or are enthusiastically following the wrong instructions. Volunteers want to do good work! But sometimes a skilled, enthusiastic volunteer from one project starts working on a different project that follows different conventions, and keeps doing things the way the old project did. This can also happen when the project gave guidance at the beginning, but not to volunteers who joined later.

British Library Manuscripts

That happened when the British Library transcribed its Arabic scientific manuscripts; they held a launch event in London that was well attended, and seeded the project with volunteers who understood the instructions well. Late in the project, a new volunteer discovered it and started “correcting” the medieval text to modernize the spelling, contrary to the goals of the project.

Correction in Notes

The project staff were able to intervene by leaving notes on the pages the user had edited, and gave some gentle guidance and encouragement to the user.

Often, however, individual correction can drive volunteers away, no matter how tactfully done.

Email Reminder

The Missouri State Archives addresses this problem by sending emails that look like regular updates to the whole volunteer pool, highlighting the issue they’re seeing and reminding the volunteers of their expectations. However, the email is sent only to the errant user, correcting them without singling them out.

Ending a Project

Let’s talk about what happens after the projects are done.  There are two things to think about here: what happens to the volunteer community, and what happens to the data?

Missouri State Archives Announcement

Just like we have a responsibility not to waste volunteers’ labor, we feel like successful projects recognize that volunteers make social connections through their work and have formed a real community.  When a project ends, volunteers can be saddened to lose the experience.  One option is to point them to other projects at your institution, as you see in this social media post.

eVolunteer Updates Email

However, you might not have a project ready, or the projects you do have may not match volunteers’ skills and interest.  Another option is to point volunteers to similar projects at other institutions, while reassuring them that you’ll let them know about new projects at your own institution when they are available. 

This email goes out to current volunteers for Missouri’s projects, but also former volunteers who might have moved on to other institutions' projects and would be interested in returning to Missouri when a new project goes online.

One Volunteer’s Contributions

And we do see people moving from one project to another.  

This is a graph of one volunteer’s contributions to four projects at three other state archives after the Alabama project was finished.

Acknowledgment

It’s important to acknowledge your volunteers in some way at the end of the project.  

Completed Projects

I really like how Meredith gives a count of volunteers and records for these projects.

Lone Rock Stockade Convict Ledgers Project

This is an amazing social media post by the historian running the Lone Rock Stockade Convict Ledgers project.  

Extracting Data

The other part of finishing a project is extracting data.

Exports
- Spreadsheets/CSV
- Plain text
- HTML
- IIIF API + JSON

FromThePage supports several export formats; some indexing projects use more than one.

It’s important to figure out where your exported data will go.

This is a spreadsheet export of the Alabama WWI service card project.  Meredith pulled this out, and then she used this material to….

Alabama World War I Service Records: Aarons

Turn her digital “stacks” of cards, with just top-level metadata for 400-plus cards, into individual cards indexed by name, location, and 4 other fields.  You all know what this means to users, and how much time it saves, and how it makes finding the record you’re looking for possible.
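As a rough illustration of what "indexed by name" can look like once you have a spreadsheet export in hand, here is a small Python sketch using only the standard library. The column names (`image_id`, `surname`, `given_name`, `residence`) and the sample rows are made up for this example; a real export will have whatever fields the project owner configured.

```python
import csv
import io
from collections import defaultdict

# Made-up stand-in for a CSV export; real column names will differ.
SAMPLE_EXPORT = """\
image_id,surname,given_name,residence
card-001,Aarons,Example A.,Example Town
card-002,Aarons,Example B.,Other Town
card-003,Abbott,Example C.,Example Town
"""


def build_name_index(csv_text: str) -> dict[str, list[dict[str, str]]]:
    """Group exported rows by surname, ignoring case and stray whitespace."""
    index: dict[str, list[dict[str, str]]] = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row["surname"].strip().casefold()
        index[key].append(row)
    return dict(index)


name_index = build_name_index(SAMPLE_EXPORT)
# Look up every card for one surname instead of paging through the whole stack.
matches = name_index.get("aarons", [])
```

With a real export you would read from a file instead of a string, and most likely load the rows into your digital library system rather than an in-memory dictionary. The point is only that each card becomes a record you can find by name.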

Spreadsheet Export

This is a similar spreadsheet export of the Description Book from Police Gaol, Fortitude Valley from Queensland State Archives; this project was a spreadsheet transcription project, but the data that is exported is very similar.  This is a great starting point for analytical research work; in fact the Library of Virginia uploads their datasets into the Virginia Open Data Portal for academic researchers to use.

Description Book Females

HTML Export is less useful for indexing projects, but updating digital library systems with human-readable transcripts can still be valuable.

Page Level Text Export

What if you don’t have a place to put spreadsheet exports from ledger-style data, or are working with a system that has a single, plaintext transcript field? This is our page level text export; it’s human-readable and searchable, but it’s also future-proof because it uses a machine-readable format called Markdown.
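To show why a Markdown table can serve as both a readable transcript and structured data, here is a minimal Python sketch that renders ledger-style rows as a pipe table. This illustrates the general idea only; it is not FromThePage's export code, and the `rows_to_markdown` function and sample fields are hypothetical.

```python
def rows_to_markdown(rows: list[dict[str, str]]) -> str:
    """Render a list of row dicts as a Markdown pipe table.

    Column order follows the keys of the first row; a literal "|" inside
    a cell is escaped so it cannot break the table layout.
    """
    if not rows:
        return ""
    headers = list(rows[0].keys())
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        cells = [str(row.get(h, "")).replace("|", "\\|") for h in headers]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


# Placeholder rows, not real ledger data.
table_text = rows_to_markdown([
    {"name": "Example Person", "age": "30"},
    {"name": "Another Person", "age": "45"},
])
```

The resulting text can be pasted into a single transcript field: a person can read it as a table, full-text search will find the names in it, and a program can split it back into columns later.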

“But these projects are also a potential method of outreach and advocacy. We individually benefit from the work done by our volunteers, but the entire historical community benefits from an increased public awareness of and appreciation for what we do. Crowdsourcing invites citizens to join in that work, giving them a sense of responsibility for the past we all share; a direct, personal connection to the people who made it; and a desire to support and protect the institutions that preserve it.”
  
- Meredith McDonough

We want to leave you with this quote from Meredith about their perspective, and open up the call for discussion.
