We interviewed the team at the Stanford University Archives on how they're using FromThePage. We think you'll enjoy hearing how they are "round tripping" the transcriptions back into their library systems using IIIF, engaging volunteers with the help of a student intern, and funding the project. Thanks to the team at Stanford for taking the time to talk to us!
What are your goals for the project?
We’re committed to providing broad public access to our collections, so the fact that some of our most important historical materials are handwritten has proven to be problematic. Although we’ve digitized many of these documents, the scans aren’t machine readable; they can’t be keyword searched, and they are difficult for users both to discover online and to parse for important topics and subjects. We see our partnership with FromThePage as a great way to empower Stanford community members to help resolve this issue by transcribing some of the most interesting and important handwritten correspondence in our collections.
In addition, since Stanford Libraries is pioneering IIIF, we are using this project to test the round tripping of IIIF manifests from the Stanford Digital Repository to FromThePage and back with annotations. We hope to take advantage of this workflow for use in online exhibits utilizing the Spotlight platform to provide full-text searching of the transcriptions created with FromThePage.
Tell us about your documents.
The Archives’ accessioned collections include hundreds of thousands of handwritten documents and photographs, many of which would benefit from crowdsourced transcription and annotation. For now, we’re primarily focusing on some of our flagship collections, including the letters of our founders, Jane Lathrop Stanford and Leland Stanford, and the letters of their son and the university’s namesake, Leland Stanford, Jr. Correspondence and related materials from other prominent individuals, including Sarah Winchester and Eadweard Muybridge, are also featured, as is a groundbreaking survey of women’s sexual habits and activities conducted by faculty member Clelia Mosher in 1892, over five decades before similar work undertaken by Alfred Kinsey.
In fact, we’re especially interested in enabling access to handwritten materials that are representative of Stanford women, communities of color, LGBTQIA individuals, and activists, in line with the Stanford Archives’ ongoing initiative to collect, process, make accessible, and promote materials generated by these communities.
Students and alumni represent two of our primary constituencies for this project, and so we are also using this opportunity to feature handwritten letters documenting the student experience of major 20th century events, such as the 1906 San Francisco earthquake, WWI, and WWII, as well as other handwritten materials relating to student groups, such as the minutes of the Chinese Club and Movimiento Estudiantil Chicanx de Aztlán (MEChA).
Where did the funding for your project come from?
Funding for this project was secured through an internal grant encouraging innovative projects by Stanford Libraries staff, named for a late Stanford Professor of History and longtime friend of the library, Payson J. Treat.
How are you recruiting or finding volunteers? Since this has a big alumni component, how are you engaging alumni?
By highlighting handwritten materials representing a broad cross section of authors, topics, and interests, we’re trying to engage a wide variety of participants, including undergraduate and graduate alumni, as well as others within and beyond the Stanford community, who are drawn both to the specific content and to its connection to Stanford and California history.
As a bit of background, we have several other projects in place to engage students and alumni, including the Stanford Alumni Legacy project and Stanford Stories project. We see this project as a way to extend our alumni engagement by harnessing the energy and interest of alumni beyond the activities we regularly undertake in conjunction with Reunion Homecoming. Further, by offering alumni the ability to participate online according to their own schedule, we hope the project may reach and appeal to alumni who do not regularly attend other alumni events, including homebound alumni.
Can you share your experience using FromThePage?
We soft launched the Stanford Letters project in late fall 2017, and we’ve been very pleased with the level of user activity and the feedback so far. Additionally, FromThePage has provided great support! We are in the process of expanding our marketing efforts to attract more users.
Tell us about your integration plans. How is IIIF a part of this project?
One of our motivations for partnering with FromThePage on this project was their commitment to IIIF support and integrations. We’re especially impressed with how FromThePage has recently extended IIIF support to include links to all relevant export formats at both the document and work level. Given the size of some of our collections, we worked with the team to expand FromThePage’s functional support for importing and exporting robust IIIF manifests. The fruits of these efforts are now available to all FromThePage users.
Anything else you'd like to tell us?
We’re so grateful for our volunteers’ efforts, and we want to make sure that we are as responsive to their feedback and questions as possible. To that end, we’re really excited to share that we have a new intern from San Jose State University’s School of Information starting this week, whose focus will be on raising awareness about this project, and helping nurture and grow our wonderful volunteer base!
---------------
Interested in crowdsourcing? Start a trial project on FromThePage.
Just want to talk crowdsourcing, IIIF, or how to round trip transcriptions back into your library systems? Drop Sara a line at saracarl@fromthepage.com.
KARI SMITH says
Thanks for sharing this project info. I’d like to know about the QA of crowdsourced transcriptions and the policies for attributing them. How much time goes into QA of the transcriptions before they are put into the library systems?
Ben Brumfield says
From a technical perspective, projects on FromThePage follow a single-track system, in which a single version of the transcription is edited collaboratively by one or more volunteers or staff members. (This contrasts with the multi-track/double-keying approach used by some other platforms, in which volunteers work on their own version of the same text independently, and those versions are somehow reconciled by staff members.)
The platform provides a “needs review” status flag, which can be used in two modes. Some organizations (including Stanford University Archives) configure their projects so that any initial transcript automatically sets the “needs review” flag for the page; a second pass must be taken by someone who reviews the transcript and clears the flag. Other organizations allow transcribers to edit at will, and only request a review if they are unsure of their reading, or would like a second set of eyes on the page.
The completion status of each work is exposed to volunteers through the user interface (allowing them to locate and review works needing attention) and to the institution on an “export” screen and via an API. Only works that have been fully transcribed or OCR-corrected, and that have no pages still needing review, are considered 100% complete. (Fine-grained percentages like “percent needing review”, “percent marked blank”, etc. are also exposed in the API.)
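To make the two review modes and the completion rule concrete, here is a minimal sketch in Python. It is an illustration only, not FromThePage's code or API: the `Status`, `Page`, `save_transcript`, `clear_review`, and `work_stats` names are all hypothetical, and treating pages marked blank as finished is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    """Hypothetical page states, named after the flags described above."""
    UNTRANSCRIBED = "untranscribed"
    TRANSCRIBED = "transcribed"
    NEEDS_REVIEW = "needs review"
    BLANK = "blank"


@dataclass
class Page:
    status: Status = Status.UNTRANSCRIBED


def save_transcript(page, review_by_default, request_review=False):
    """Record an initial transcript.

    review_by_default=True models projects where every first pass is flagged;
    otherwise the page is flagged only when the transcriber asks for review.
    """
    flagged = review_by_default or request_review
    page.status = Status.NEEDS_REVIEW if flagged else Status.TRANSCRIBED


def clear_review(page):
    """A reviewer's second pass clears the flag."""
    if page.status is Status.NEEDS_REVIEW:
        page.status = Status.TRANSCRIBED


def work_stats(pages):
    """Summarize a work: fine-grained percentages plus an overall flag."""
    total = len(pages)

    def percent(status):
        if total == 0:
            return 0.0
        return 100.0 * sum(p.status is status for p in pages) / total

    # Assumption: blank pages count as finished for completion purposes.
    done = (Status.TRANSCRIBED, Status.BLANK)
    return {
        "percent_needs_review": percent(Status.NEEDS_REVIEW),
        "percent_blank": percent(Status.BLANK),
        "complete": total > 0 and all(p.status in done for p in pages),
    }


# Usage: a two-page work in a project that flags every first pass.
work = [Page(), Page()]
save_transcript(work[0], review_by_default=True)
print(work_stats(work))
```

In this sketch a work only becomes complete once every page has been transcribed (or marked blank) and every review flag has been cleared, which mirrors the rule described above.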
Ben Brumfield says
Attribution within the crowdsourcing platform is closely tied to version control and change tracking, so that every contribution lists the “display name” of the volunteer who made the edit. These names are also listed by frequency of contribution on project leader boards, though those leader boards are not prominent on the site.
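As a rough illustration of how a leader board can fall out of change tracking, the sketch below counts edits per display name. The `(display_name, page_id)` edit log and the `leaderboard` function are hypothetical stand-ins for the example, not the platform's actual data model.

```python
from collections import Counter


def leaderboard(edits):
    """Rank display names by number of recorded edits, most active first.

    `edits` is a hypothetical change log of (display_name, page_id) pairs.
    """
    counts = Counter(display_name for display_name, _page_id in edits)
    return counts.most_common()


# Usage with a made-up change log.
edit_log = [("ana", "page-1"), ("ben", "page-1"), ("ana", "page-2")]
print(leaderboard(edit_log))
```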
Again, I’m writing about technical capabilities of the crowdsourcing platform, rather than processes within the institution. We have recently added an API for accessing volunteer contributions, and your question made us realize that the API does not provide a mechanism for communicating credits. This oversight will be rectified shortly.