Online Reference Shelf Will Put Historical Data at Your Fingertips

When researchers seek historical information about Harvard or Radcliffe, or even about the history of higher education in the United States, they often turn to primary sources in the Harvard and Radcliffe Archives. Most often, the quest begins with a browse through the many volumes of annual reports of the Harvard and Radcliffe presidents.

Annual reports have traditionally provided an opportunity for the presidents of both institutions to launch new initiatives and state their positions on a variety of topics.

Most of the reports have also included the individual annual reports of all the major departments throughout Harvard and Radcliffe. Spanning 1825 to 1995, these volumes offer detailed views of life at both institutions – from student demographics, faculty appointments, and the establishment of new academic disciplines, to philanthropy, extracurricular life, and the architectural history of the campuses.

However, because the reports are so broad in scope, finding specific information often requires an exhaustive page-by-page search. And while many of the reports are indexed, none of the indexes provides the degree of specificity required for in-depth historical research.

“If you wanted to know about the courses attended by Gertrude Stein, or when and why posture pictures were taken, or the history of coeducation at Radcliffe and Harvard, you would begin with the annual reports,” says Jane Knowles, director of the Radcliffe Archives and acting director of the Schlesinger Library. “And it would be a laborious process of searching through many volumes.”

With the advent of the Harvard-Radcliffe Online Historical Reference Shelf, a joint project conceived by the Harvard University Archives and the Radcliffe Archives and funded by the University’s Library Digital Initiative (LDI), using these resources will become as simple as navigating a Web site.

As its name implies, the project is creating a virtual bookshelf where electronic versions of the reports, along with other historical reference sources, can be easily accessed and searched via the World Wide Web. The Reference Shelf’s online interface will feature searching and browsing capabilities. Lists of titles will have links to page images from the various volumes in their original format.

Researchers will be able to navigate through each volume page by page or section by section, and a search screen will enable them to find specific names, topics, statistics, and dates. Researchers will also have the option of printing out single pages or sections of volumes.

With a launch planned for the fall of 2000, the Online Historical Reference Shelf will offer, as its first installment, 270 volumes of Harvard reports published between 1825 and 1995 and 87 volumes of Radcliffe reports published between 1880 and 1988.

“The annual reports are truly gold mines of information about Harvard and Radcliffe, and electronic conversion will add considerable value to them,” says Harley Holden, University Archivist. “You can always come to the Archives and look at the original printed volumes, but the Online Reference Shelf will offer round-the-clock access from your desktop. Furthermore, fully searchable text will let people get right to the specific information they need.”

To prepare the reports for conversion to electronic form, the Harvard College Library Conservation Services disbound duplicate copies of each volume, resulting in an estimated 95,600 pages that were then scanned by staff at the HCL Digital Imaging Group. Where necessary, original volumes were kept intact and scanned using an overhead scanner.

The electronic image files have been sent to the University of Michigan, where sophisticated optical character recognition (OCR) software is being used to convert the image files to text. As the first files return from Michigan, staff in the Harvard and Radcliffe Archives are moving on to the next step in the process: checking the accuracy of the OCR output by running test searches on a subset of 5,000 pages.

If the searches are unsuccessful or reveal problems – a possibility with some of the oldest reports, which were printed in antiquated typefaces – those volumes will be typed in by hand to ensure effective and accurate search capability.
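The spot check described above can be pictured with a short Python sketch. It is only an illustration, not the Archives' actual procedure: the function name, the volume labels, the sample OCR strings, and the 90 percent pass threshold are all assumptions made for the example. The idea is simply to search the OCR text for words a proofreader has seen on the printed page and to flag any volume where too many of those searches fail.

```python
from collections import defaultdict


def volumes_needing_rekeying(ocr_pages, spot_checks, min_hit_rate=0.9):
    """Return the volumes whose spot-check searches fail too often.

    ocr_pages:    {(volume, page): OCR text for that page}
    spot_checks:  [(volume, page, term)], where a proofreader has confirmed
                  that `term` appears on the printed page
    min_hit_rate: assumed pass threshold (hypothetical, not from the article)
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for volume, page, term in spot_checks:
        totals[volume] += 1
        text = ocr_pages.get((volume, page), "")
        # Case-insensitive substring search, standing in for a real search engine.
        if term.lower() in text.lower():
            hits[volume] += 1
    return sorted(v for v in totals if hits[v] / totals[v] < min_hit_rate)


# Invented sample data: "volume-A" imitates OCR errors from an old typeface.
ocr_pages = {
    ("volume-A", 3): "Tbe Prefident and Fellows of Harvard Colledge",
    ("volume-A", 4): "The number of ftudents admitted",
    ("volume-B", 12): "The Dean reports an increase in enrollment",
}
spot_checks = [
    ("volume-A", 3, "President"),
    ("volume-A", 4, "students"),
    ("volume-B", 12, "enrollment"),
]

flagged = volumes_needing_rekeying(ocr_pages, spot_checks)
print(flagged)
```

In this invented sample, both searches against "volume-A" miss because of misread letters, so that volume alone is flagged for typing in by hand.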

“When you go to the Reference Shelf and look at the reports, you’re not going to see the OCR text. You’re going to see images of the actual pages in the annual reports,” says Robin McElheny, of the Harvard Archives. “When you look for a particular word, the search will take place behind the scenes in the text files. A list of pages that contain your word will appear. If you go to a page and print it out, it will look like a photocopy of the original page.”
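One way to picture the arrangement McElheny describes is a word index built from the hidden OCR text, with each hit pointing back to the scanned page image. The Python sketch below is a minimal illustration under stated assumptions; the function names, the volume labels, and the image file paths are hypothetical, and the article does not say how the Reference Shelf search is actually implemented.

```python
import re
from collections import defaultdict


def build_index(ocr_pages):
    """Map each lowercase word to the set of page ids whose OCR text contains it."""
    index = defaultdict(set)
    for page_id, text in ocr_pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(page_id)
    return index


def search(index, word, image_for):
    """Return (page id, page image) pairs for a word; the OCR text is never shown."""
    page_ids = sorted(index.get(word.lower(), set()))
    return [(page_id, image_for[page_id]) for page_id in page_ids]


# Invented sample data: page ids are (volume, page number) pairs.
ocr_pages = {
    ("volume-A", 1): "Report of the President",
    ("volume-A", 2): "The Library received gifts",
    ("volume-B", 7): "The president of the college",
}
# Hypothetical image paths; the real file layout is not described in the article.
image_for = {pid: f"{pid[0]}/page-{pid[1]:04d}.tif" for pid in ocr_pages}

index = build_index(ocr_pages)
results = search(index, "President", image_for)
for page_id, image in results:
    print(page_id, image)
```

The reader who searches for a word gets back a list of pages and their images, which matches the behavior described in the quote: the text files do the finding, and the page images do the showing.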

In a later stage, project staff will create an electronic table of contents for each report, which will allow users to move quickly from one section of a report to another.

As other historical sources about Harvard and Radcliffe are located or created, they too will be added to the Online Reference Shelf. Among the first additions will be links to The Harvard Book (1875), which was converted to electronic form in an earlier project, and an electronic version of A Century to Celebrate (1979), a history of Radcliffe published at its centennial.

“This will make visible the contributions of Radcliffe women who have been part of Harvard’s history for the last 120 years and have often been overlooked,” says Knowles. Later additions to the Reference Shelf will include links to historical resources already available online, such as the Harvard Fact Book, published by the Budget Office.

“We want the Reference Shelf to provide ‘one-stop shopping’ for basic historical information about Harvard and Radcliffe,” says McElheny.

Electronic conversion of the reports also serves archival purposes. Like many older documents, the reports were printed on acidic wood-pulp paper, which makes the documents vulnerable to damage from age and handling. Eventually, the electronic image files will be copied to microfilm to create a preservation master set.

“Preservation is another key benefit of the project,” says Knowles. “In their current form, the reports are a consumable resource. Pages are brittle and even broken. By converting them to electronic form and microfilm, we will improve their current availability and ensure their availability for future use.”

Together with similar projects funded through LDI’s Internal Challenge Grant program, a successful Harvard-Radcliffe Online Historical Reference Shelf prototype will soon put a wealth of online resources within researchers’ reach.