Who knew you could track climate change through crowdsourced transcription? The smart folks at the U.S. Geological Survey, that’s who!
The USGS North American Bird Phenology Program encouraged volunteers to submit cards recording bird sightings across North America from the 1880s through the 1970s. These cards are now being transcribed into a database for analysis of changes in migratory patterns and what they imply about climate change.
There’s a really nice DesertUSA NewsBlog article that covers the background of the project:
The cards record more than a century of information about bird migration, a veritable treasure trove for climate-change researchers because they will help them unravel the effects of climate change on bird behavior, said Jessica Zelt, coordinator of the North American Bird Phenology Program at the USGS Patuxent Wildlife Research Center.
That is — once the cards are transcribed and put into a scientific database.
And that’s where citizens across the country come in – the program needs help from birders and others across the nation to transcribe those cards into usable scientific information.
CNN also interviewed a few of the volunteers:
Bird enthusiast and star volunteer Stella Walsh, a 62-year-old retiree, pecks away at her keyboard for about four hours each day. She has already transcribed more than 2,000 entries from her apartment in Yarmouth, Maine.
“It’s a lot more fun fondling feathers, but, the whole point is to learn about the data and be able to do something with it that is going to have an impact,” Walsh said.
Let’s talk about the software behind this effort.
The NABPP is fortunate to have a limited problem domain. A great deal of standardization was imposed on the manuscript sources themselves by the original organizers, so that, for example, each card describes only a single species and location. In addition, the questions the modern researchers are asking of the corpus also limit the problem domain: nobody’s going to be doing analysis of spelling variations between the cards. It’s important to point out that this narrow scope exists in spite of wide variation in format between the index cards. Some are handwritten on pre-printed cards, some are typewritten on blank cards, and some are entirely freeform. Nevertheless, they all describe species sightings in a regular format.
Because of this limited scope, the developers were (probably) able to build a traditional database and data-entry form, with specialized attributes for species, location, date, or other common fields that could be generalized from the corpus and the needs of the project. That meant custom-building an application specifically for the NABPP, which seems like a lot of work, but it does not require building the kind of Swiss Army Knife that medieval manuscript transcription requires. This presents an interesting parallel with other semi-standardized, hand-written historical documents like military muster rolls or pension applications.
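To make the idea of a narrow, form-driven data model concrete, here’s a minimal Python sketch of what one transcribed card might look like as a record. I haven’t seen the NABPP schema, so the class name, the field names, and the validation rules are all my own assumptions, not the project’s actual design.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CardRecord:
    """One transcribed card: a single species at a single location.

    Hypothetical schema for illustration only; not the NABPP's real one.
    """
    card_id: int
    species: str
    location: str
    first_seen: Optional[date] = None  # left empty when the card gives no date
    transcriber: str = ""

    def __post_init__(self) -> None:
        # Every card describes exactly one species and one location,
        # so both fields are required and must not be blank.
        if not self.species.strip():
            raise ValueError("species is required")
        if not self.location.strip():
            raise ValueError("location is required")


# Example: a handwritten and a typewritten card end up in the same shape.
card = CardRecord(
    card_id=1,
    species="Barn Swallow",
    location="Yarmouth, Maine",
    first_seen=date(1905, 4, 28),
    transcriber="volunteer_a",
)
```

The point of the sketch is that a handful of typed fields is enough here. A general-purpose manuscript tool would instead have to store free text plus markup, because it can’t know in advance what a page contains.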
One of the really neat possibilities of subject-specific transcription software is that you can combine training users on the software with training them on difficult handwriting, or variations in the text. NABPP has put together a screencast for this, which walks users through transcribing a few cards from different periods, written in different formats. This screencast explains how to use the software, but it also explains more traditional editorial issues like what the transcription conventions are, or how to process different formats of manuscript material.
This is only one of the innovative ways the NABPP deals with its volunteers. I received a newsletter by email shortly after volunteering, announcing their progress to date (70K cards transcribed) and some changes in the most recent version of the software. The newsletter included some potentially embarrassing details that a less confident organization might not have mentioned, but which really do a lot to build goodwill with volunteers. Users may get used to workarounds for annoying bugs, but in my experience they still remember them and are thrilled when those bugs are finally fixed. So when the newsletter announces that “The Backspace key no longer causes the previous page to be loaded”, I know that they’re making some of their volunteers very happy.
In addition to the newsletter, the project also posts statistics on the transcription project, broken down both by volunteer and by bird. The top-ten list gives the game-like feedback you’d want in a project like this, although I’d be hesitant to foster competition in a less individually-oriented project. They’re also posting preliminary analyses of the data, including the phenology of barn swallows, mapped by location and date of first sighting, and broken down by decade.
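Once the cards are in structured form, both kinds of report are simple aggregations. Here’s a rough Python sketch, assuming each transcribed card reduces to a (transcriber, species, location, first-seen date) tuple. The function names and the sample rows are invented for illustration; they are not NABPP code or data, and the numbers mean nothing.

```python
from collections import Counter, defaultdict
from datetime import date

# (transcriber, species, location, first_seen): invented sample rows
ROWS = [
    ("volunteer_a", "Barn Swallow", "Maine", date(1905, 4, 28)),
    ("volunteer_a", "Barn Swallow", "Maine", date(1908, 4, 25)),
    ("volunteer_a", "Purple Martin", "Ohio", date(1902, 4, 5)),
    ("volunteer_b", "Barn Swallow", "Maine", date(1961, 4, 20)),
    ("volunteer_b", "Purple Martin", "Ohio", date(1964, 4, 2)),
]


def top_transcribers(rows, n=10):
    """Return the n volunteers with the most transcribed cards."""
    return Counter(row[0] for row in rows).most_common(n)


def mean_first_sighting_by_decade(rows, species):
    """Average day-of-year of first sighting for one species, per decade."""
    days = defaultdict(list)
    for _, row_species, _, seen in rows:
        if row_species == species:
            decade = seen.year // 10 * 10
            days[decade].append(seen.timetuple().tm_yday)
    return {decade: sum(v) / len(v) for decade, v in sorted(days.items())}


leaderboard = top_transcribers(ROWS)
swallows = mean_first_sighting_by_decade(ROWS, "Barn Swallow")
```

Using day-of-year rather than the raw date is what lets sightings from different years be averaged together. A real analysis would also need to group by location, since arrival dates vary with latitude.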
Congratulations to the North American Bird Phenology Program for making crowdsourced transcription a reality!