Research team awarded Schmidt Sciences grant for AI humanities project

A Harvard research team led by Peter K. Bol, Charles H. Carswell Professor of East Asian Languages and Civilizations, has been awarded a $600,000 grant from Schmidt Sciences for a project to develop multilingual AI tools to help scholars access, analyze, and compare Eurasian historical documents.

The team is one of 23 around the world to be awarded funding from Schmidt Sciences for projects that use AI in humanities disciplines. The philanthropic organization, which funds research in science and technology, awarded a total of $11 million in this round of funding.

“We are delighted to be selected for support in this initial round of the Schmidt Science Foundation’s program for exploring how AI can better serve the humanities, and how the humanities can improve AI,” Bol said.

The project, titled “Connectivity and Individuality in Textual Traditions: Augmenting Retrieval for Eurasian Languages,” examines how Eurasian textual traditions spread, change, and compete with each other, and how AI, when trained to understand the relevant languages and history, can help scholars detect those patterns on a large scale while preserving unique linguistic and historical details. The research team also includes Kianté Brantley, assistant professor of computer science at the Harvard John A. Paulson School of Engineering and Applied Sciences; Yehuda Halper from Bar Ilan University; Unso Jo from Cornell University; Khatchig Mouradian from Columbia University; Sebastian Nehrdich from Tohoku University; and Donald Sturgeon from Durham University.

The researchers will conduct five case studies on documents ranging from Aristotelian commentaries to midwifery registers, written in eight underserved languages, including Greek, Arabic, Hebrew, Armenian, Sanskrit, Chinese, and Tibetan. A computer science team will pre-train small-to-medium multilingual foundation models, build a Graph-RAG pipeline that tracks intertextuality, corpus formation, and concept drift across time and translation, and release an evaluation framework that combines expert judgment with quantitative metrics.

The goal is to produce rigorously tested tools that can search across broad historical time periods in low-resource languages, and to produce an AI workflow that keeps scholars involved, aiding traditional humanities research with state-of-the-art multilingual natural-language processing.