Last month, I read "Imbalanced volunteer engagement in cultural heritage crowdsourcing: a task-related exploration based on causal inference", by Zhang, Zhang, Zhao and Zhu. The authors analyzed the Trove crowdsourcing platform at the National Library of Australia to look for patterns in contributions by volunteers correcting the OCR text of old newspaper articles. While I can’t evaluate their statistical methods, I’m impressed with the authors’ awareness of crowdsourcing practices within cultural heritage and their use of real-world library data instead of the simulations I often see in the literature.
The authors do a careful job of accounting for factors inherent in the newspaper data: most of their graphs break the data down by year of newspaper issue and by article type, both of which could easily affect volunteer interest or OCR quality (and hence task difficulty). You can see that older issues attract more contributions, and that family notices are far more popular than list-style content such as rail schedules.
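As a rough illustration of that kind of breakdown (this is my own sketch in pandas, with invented data and column names, not the authors' actual pipeline):

```python
import pandas as pd

# Hypothetical article-level data; the column names and values are
# my invention, not taken from the paper or from Trove's schema.
articles = pd.DataFrame({
    "issue_year":   [1890, 1890, 1915, 1915, 1940, 1940],
    "article_type": ["family_notice", "list", "family_notice",
                     "list", "family_notice", "list"],
    "corrections":  [42, 3, 28, 5, 11, 2],
})

# Break contributions down by issue year and article type,
# mirroring the kind of cuts the authors show in their graphs.
summary = (articles
           .groupby(["issue_year", "article_type"])["corrections"]
           .mean()
           .unstack())
print(summary)
```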
The authors focus on "IVE" (Imbalanced Volunteer Engagement), the uneven distribution of volunteer activity and task completion rates within projects like Trove. They find that task completion rates vary immensely by topic, and they offer some concrete recommendations for alleviating the problem:
- Displaying lists of tasks to be completed can nudge volunteers toward neglected tasks. Lists give volunteers a route around their own assumptions about what they might be interested in.
- Volunteers choose tasks based on their subjective perception of how difficult a task will be, which might not be accurate. Task completion in Trove was inversely correlated with the number of words in an article, not with the number of words requiring correction (see the sketch after this list). Making tasks more fine-grained or communicating their objective difficulty might alleviate this.
- They recommend making neglected tasks more visually appealing.
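To make that difficulty finding concrete, here is a minimal sketch of the comparison being described: how task completion tracks total article length versus the number of words actually needing correction. This is my own construction with invented data and column names, not the authors' method, and Spearman rank correlation is just one reasonable choice of measure:

```python
import pandas as pd
from scipy.stats import spearmanr

# Invented per-article data: total word count, number of OCR errors
# to fix, and whether volunteers completed the correction task.
tasks = pd.DataFrame({
    "total_words":  [120, 450, 900, 300, 1500, 200, 700, 60],
    "words_to_fix": [30, 20, 15, 40, 10, 35, 25, 12],
    "completed":    [1, 1, 0, 1, 0, 1, 0, 1],
})

# If volunteers judge difficulty by apparent length, completion should
# correlate (negatively) with total_words more strongly than with
# words_to_fix, the more objective measure of effort required.
for col in ["total_words", "words_to_fix"]:
    rho, p = spearmanr(tasks[col], tasks["completed"])
    print(f"{col}: rho = {rho:+.2f} (p = {p:.2f})")
```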
I'm pretty impressed with their analysis and their recommendations, though I do wonder whether imbalanced volunteer engagement is a serious problem in cultural heritage. If volunteer priorities match institutional priorities, the “imbalance” may just represent volunteers working on the most important material first. If correct text in family notices is more important to Trove than correct text in railroad schedules or commodity prices, then I’d say that the system is working pretty well.
We’d love to hear from you – if you run or participate in a crowdsourcing project, do you think that some important tasks are neglected by most volunteers? If so, why, and what might be done?