Google to digitize some Harvard library holdings

Pilot program includes providing online access to certain public domain texts

Harvard University is embarking on a collaboration with Google that could harness Google’s search technology to give both the Harvard community and the larger public a revolutionary new tool for locating materials available in libraries.

Staff photo illustration Jen Godfrey/Harvard News Office

In the coming months, Google will collaborate with Harvard’s libraries on a pilot project to digitize a substantial number of the 15 million volumes held in the University’s extensive library system. Google will provide online access to the full text of those works that are in the public domain. In related agreements, Google will launch similar projects with Oxford, Stanford, the New York Public Library, and the University of Michigan. An FAQ detailing the Harvard pilot program with Google is available at http://hul.harvard.edu.

The Harvard pilot will provide the information and experience on which the University can base a decision to launch a large-scale digitization program. Any such decision will reflect the fact that Harvard’s library holdings are among the University’s core assets, that the magnitude of those holdings is unique among university libraries anywhere in the world, and that the stewardship of these holdings is of paramount importance. If the pilot is deemed successful, Harvard will explore a long-term program with Google through which the vast majority of the University’s library books would be digitized and included in Google’s searchable database. Google will bear the direct costs of digitization in the pilot project.

By combining the skills and library collections of Harvard University with the innovative search skills and capacity of Google, a long-term program has the potential to create an important public good. According to Harvard President Lawrence H. Summers, “Harvard has the greatest university library in the world. If this experiment is successful, we have the potential to provide the world’s greatest system for dissemination as well.”

In addition, there would be special benefits to the Harvard community. Plans call for the eventual development of a link allowing Google users at Harvard to connect directly to the online HOLLIS (Harvard Online Library Information System) catalog, located at http://holliscatalog.harvard.edu, for information on the location and availability at Harvard of works identified through a Google search. This would merge the search capacity of the Internet with the deep research collections at Harvard into one seamless resource – a development especially important for undergraduates, who often see the library and the Internet as alternative and perhaps rival sources of information.

Eventually, Harvard users would benefit from far better access to the 5 million books located at the Harvard Depository (HD). If the University undertakes the long-term program, Harvard users would gain online access to the full text of out-of-copyright books stored at HD. For books still in copyright, Harvard users could gain the ability to search for small snippets of text and, possibly, to view tables of contents. In short, the Harvard student or faculty member would gain some of the advantages of browsing that remote storage of books at HD cannot currently provide.

According to Sidney Verba, Carl H. Pforzheimer University Professor and director of the University Library, “The possibility of a large-scale digitization of Harvard’s library books does not in any way diminish the University’s commitment to the collection and preservation of books as physical objects. The digital copy will not be a substitute for the books themselves. We will continue actively to acquire materials in all formats and we will continue to conserve them. In fact, as part of the pilot we are developing criteria for identifying books that are too fragile for digitizing and for selecting them out of the project.

“It is clear,” Verba continued, “that the new century presents unparalleled challenges and opportunities to Harvard’s libraries. Our pilot program with Google can prove to be a vital and revealing first step in a lengthy and rewarding process that will benefit generations of scholars and others.”

Faculty of Arts and Sciences Dean William C. Kirby commented, “This project has both historic, and immediate, importance. It will at once help preserve our invaluable collections and render them more accessible for scholarship. That we can harness our current technologies for the purpose of tremendous scholarly gain is deeply gratifying. I salute Sid Verba, the Harvard University Library, and Google, who have all helped to make this possible.”

The Harvard University Library, founded in 1638, is the largest academic library system in the world. Harvard’s rich and extensive collections serve as invaluable tools for teaching and research. These collections include books, journals, primary source materials, and audiovisual and digital resources that span a vast range of subjects, languages, and time periods. Through its use of cutting-edge technology, the Harvard University Library is offering online access to an increasing number of library holdings for the benefit of the general public.