Telling apples from Apples

Students using card catalogs at Widener Library, 1945.
Courtesy of Harvard University Archives
Harvard Library search tool will understand intent behind the terms
In the 50 years since card catalogs moved online, the way we search for library materials has stayed much the same. Users enter keywords into a search, the system looks for those keywords and returns results.
As collections and data have grown exponentially, it has become harder to fine-tune a search for the right results. If you search a library catalog for “the history of Apple,” you’ll get results mainly for the fruit rather than the company. The system only understands the words, not the meaning.
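To make the limitation concrete, here is a minimal sketch of word-overlap matching in Python. It is not Harvard Library's catalog code. The three records, the stopword list, and the function names are all invented for illustration, and real catalog systems are far more elaborate.

```python
import re

# Hypothetical catalog records, invented for illustration only.
CATALOG = [
    "Apple growing in New England orchards",
    "A history of the apple and its cultivation",
    "Apple Inc. and the personal computer industry",
]

# A tiny, assumed stopword list; real systems use much longer ones.
STOPWORDS = {"the", "of", "a", "and", "in", "its"}


def tokenize(text):
    """Lowercase the text and return its set of words, minus stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def keyword_search(query, records):
    """Rank records by how many query words each one contains."""
    terms = tokenize(query)
    scored = [(len(terms & tokenize(record)), record) for record in records]
    # Highest overlap first; records with no matching words are dropped.
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [record for score, record in ranked if score > 0]


results = keyword_search("the history of Apple", CATALOG)
for title in results:
    print(title)
```

In this toy catalog the title about the fruit ranks first, because it contains both "history" and "apple." The matcher treats "Apple" the company and "apple" the fruit as the same word, so nothing in the query can steer it toward the company.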
A Harvard Library team is building a new search tool to change that.
Using generative artificial intelligence and semantic search technologies, its new Collections Explorer will break through the limitations of keyword search to decipher the intent behind your words. It will allow you to ask questions and carry out your search in natural language.
What poems of Emily Dickinson’s include handwritten marginalia? What does Harvard have on the history of germ theory development? Tell me about the Black empowerment movement in America.
Imagine asking any of these questions, exactly as they are worded, on the library’s website and getting the results you’re really looking for. Soon, you can.
Pioneering a new model of search and discovery
With more than 20 million physical and digital items in dozens of formats — from ancient manuscripts to journal articles to one-of-a-kind maps and original poetry recordings — finding the right item for your research in Harvard’s vast collections is a complex endeavor.
For librarians and technologists at the library, the rise of generative AI presents an opportunity to tackle this problem while challenging conventional thinking about traditional library search.
Martha Whitehead, University librarian and vice president for Harvard Library, recognized that library searches needed to evolve, and she charged her team with finding a way to incorporate AI into search.
“How can Harvard Library model what is possible in this brave new world of library discovery enabled and enhanced by AI?” she asked.

Collections Explorer is slated to launch publicly in the fall.
Photo by Scott Murry
Partnering with Mozilla.ai, the nonprofit’s division dedicated to open-source and trustworthy AI, a Harvard Library team led by Stu Snydman, associate University librarian and managing director of Library Technology Services, got to work.
“Keyword search is now 50 years old. With our new discovery system, we demonstrate how recent generative AI technologies, such as large language models (LLMs), can intersect with established AI technologies to create a powerful tool for finding and discovering information,” Snydman said.
Harvard Library’s three-month partnership with Mozilla.ai led to a prototype for a new AI-driven search tool, Collections Explorer. Built by Library Technology Services, the tool uses generative AI to search across repositories and collection formats. Its alpha release, which just completed user testing, is slated to launch publicly in the fall.
Using Collections Explorer
The Collections Explorer is intuitive and transparent. Suppose you’re curious about Chinese artwork at Harvard. You can type in your question — “Does Harvard hold any artwork from China?” — as if you’re talking to a librarian.
Along with results from the Harvard Art Museums’ archives and collections of Chinese calligraphy and painting, you’ll also see illustrations of Chinese plants from a global botanical illustration collection. The results include explanations of why they’re a match for your prompt.
The Explorer also suggests additional prompts, such as “Notable Chinese paintings and sculpture at Harvard University” or “Exploring the Chinese art treasures housed at Harvard.” Serendipity and creativity are built into the system.
Ask the tool “What does Harvard have on the history of germ theory development?” and along with results from the library’s collections, the system suggests you try: “How did public education campaigns of the 19th century intersect with germ theory?” or “What are techniques for antiseptic surgery?” Each new prompt opens a new possibility for inquiry.
“With Collections Explorer, our new discovery system for the AI age, Harvard Library meets the needs of its community and the public in new and innovative ways,” Whitehead said. “We look forward to the next 50 years.”