William Brewster was 14 when he started chronicling the habitat of birds and wildlife in Cambridge. He went on to document the changing natural landscape of his hometown for more than 50 years from the late 1800s into the 1900s.
Brewster’s work, part of the collection at Harvard’s Ernst Mayr Library of the Museum of Comparative Zoology, represents an important resource in the study of the region’s natural history. One big problem, however, comes with transcribing the volumes of handwritten observations into digital text files that can be accessed and mined online.
A new initiative is underway to use gaming and crowdsourcing to speed the massive task of transcribing such documents, at Harvard and around the world.
The project, funded by a grant from the Institute of Museum and Library Services, enlists video gamers to help correct digital transcripts of documents that are not easily converted into clean text files. Purposeful Gaming is a collaborative effort among the Missouri Botanical Garden, the Ernst Mayr Library, the New York Botanical Garden, Cornell University, and other members of an archives consortium called the Biodiversity Heritage Library.
“What we hope is that people with an interest in games — but who want to do something useful as well — will find these games to be the perfect answer,” said Ernst Mayr librarian Constance Rinaldo. “People who love beautiful books and are fascinated by early scientific exploration, natural history, and games have an opportunity to help improve discovery of concepts in handwritten notes and other documents that are difficult to automatically transcribe.”
The process under study is an alternative to optical character recognition (OCR), which converts images of text into encoded text files. OCR works well with uniform printed text, but less so with handwritten documents and certain typefaces.
The Ernst Mayr Library was chosen as a partner in the grant because it was one of the first libraries in the consortium to digitize field notes and diaries. About a dozen volumes of Brewster’s diaries were used to test the effectiveness of the games.
“Running these handwritten notes through optical character recognition is almost useless because it does not pick up the characters, and so it cannot be converted into text files reliably,” said Joseph deVeer, head of technical services in the Museum of Comparative Zoology. “In a nutshell, the Purposeful Gaming project was designed to feed multiple transcriptions through the game where players try to reconcile and determine which version is more accurate.”
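The reconciliation step deVeer describes can be illustrated with a minimal sketch. It is not the project's actual code: the function names, the word-by-word comparison, the majority-vote rule, and the sample sentence are all assumptions made for illustration, and the sample text is invented rather than taken from Brewster's diaries.

```python
from collections import Counter


def find_disagreements(transcriptions):
    """Return the word positions where candidate transcriptions differ."""
    token_lists = [t.split() for t in transcriptions]
    # Simplifying assumption: every candidate has the same number of words.
    if len({len(tokens) for tokens in token_lists}) != 1:
        raise ValueError("this sketch assumes equal word counts")
    return [i for i, words in enumerate(zip(*token_lists)) if len(set(words)) > 1]


def reconcile(transcriptions, player_votes):
    """Build one line, letting player votes settle each disputed word.

    player_votes maps a word position to the list of words players chose.
    Disputed positions with no votes keep the first candidate's word.
    """
    token_lists = [t.split() for t in transcriptions]
    result = list(token_lists[0])
    for i in find_disagreements(transcriptions):
        votes = player_votes.get(i)
        if votes:
            # Majority vote: take the word chosen most often by players.
            result[i] = Counter(votes).most_common(1)[0][0]
    return " ".join(result)


# Hypothetical example: two automated transcriptions of the same line.
candidates = [
    "Saw two robins near the pond",
    "Saw two rohins near the pond",
]
votes = {2: ["robins", "robins", "rohins"]}  # three players voted on word 2

print(find_disagreements(candidates))
print(reconcile(candidates, votes))
```

In this sketch only the words on which the candidate transcriptions disagree are put to players, so human attention goes to the hard cases while the agreed-upon text passes through untouched.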