We can’t rewrite history, but re-typing it? That’s a different story — and it’s a story Harvard Library wants to tell with users’ help.
The library’s innovative new project invites the public to help transcribe its collection of digitized colonial-era materials from archives and libraries across the University. It is the first library-wide, crowd-sourced transcription project in Harvard’s history.
At the end of March, Harvard University Archives staff began working with Harvard Library’s Digital Strategies and Innovation team and Library Technology Services to launch a platform for transcribing the handwritten materials from 18th-century North America. Two weeks later, the site went live. Harvard Library is now welcoming and encouraging transcription contributions from archivists, librarians, history buffs, and anyone seeking productive ways to stay occupied in the new normal.
The images being transcribed are from the 700,000-plus digitized pages of diaries, recipes, court files, medical records, and other documents that make up Harvard Library’s Colonial North America collection (CNA). The full CNA collection has been digitized over the past seven years by Harvard Library Digital Imaging staff, in close collaboration with Preservation Services and the holding repositories.
During the CNA digitization work, library staff began thinking about how to make the digital materials as accessible as possible, said archivist Ross Mulcare.
“These documents are really rich and historically important,” Mulcare explained, “but they come with some serious accessibility and discoverability challenges.”
In their original form, these materials are difficult for anyone to read and comprehend, let alone someone who uses a screen reader or whose first language is not English. Mulcare said archivists determined the CNA documents should be transcribed in order to be more widely accessible. As for how to complete the transcription, a large-scale public collaboration made the most sense.
“Although there’s promising machine learning technology on the horizon,” Mulcare said, “currently the most sophisticated and accurate method for transcription of this type is tapping into the collective efforts of large groups of people.”
Anyone with a computer and an internet connection can assist with this massive transcription project; the more people contribute, the sooner CNA materials can become more broadly accessible.
The project site organizes the scanned handwritten CNA materials by their original library location. The section from Andover-Harvard Theological Library, part of Harvard Divinity School, lets readers view and transcribe hymns or sermons, while Harvard Medical School’s Countway Library has materials like doctors’ notes on smallpox. Some of the materials to be transcribed are not written in English and are tagged with other languages, including Spanish and Latin.
To contribute to the project, transcribers simply choose a handwritten document and re-type it into a text box using the listed transcription conventions. Each time a document is edited and saved, a new version of it is created and listed with its percentage completed. The site also includes an option to tag a document “Needs Review” if a transcriber wants a second set of eyes.
Once transcription of an item is finished, typed texts will live alongside the handwritten originals, improving and expanding access to the CNA collection.
All are invited to sign up to be a transcriber or reviewer and be a part of history.