Humanities: From deconstruction to digitization

The humanities is catching up with other disciplines, technologically speaking

The “analytical engine” first proposed by British mathematician Charles Babbage in 1837 would have been the world’s first computer.

It was never built, but the notion of a programmable computing machine captivated the poet Lord Byron’s daughter, who in 1843 championed the idea. Today, Augusta Ada Byron King, Countess of Lovelace, is known as “the first programmer” for her enthusiastic description of a technology that would one day change the world.

By 1890, German-designed tabulation machines were employed to assemble and analyze data for the U.S. Census. In the years that followed, tabulation machines had little impact on literary scholarship, but they did revolutionize the way retail stores and railways kept track of data, explained linguistics and computer scholar Malcolm Hyman in a Harvard Humanities Center lecture last week (April 17).

Hyman, a research associate at the Max Planck Institute for the History of Science in Berlin, addressed a group of 20 listeners at the Barker Center about the theoretical challenges ahead for humanities computing — a fast-growing corner of scholarship in the classics, modern literature, and the arts that looks to computer science for analytical help.

From 2001 to 2004, Hyman was a research associate in Harvard’s Department of Classics, where he helped create key software for the Archimedes Project, an international initiative funded by the National Science Foundation and associated with the Max Planck Institute.

The project is intended to create interactive environments for scholars researching early concepts in mechanics and engineering. Computers allow source documents in Latin, Greek, Arabic, and Italian to be cross-referenced in a way that reveals the evolution of concepts and offers novel glimpses into the sophistication of ancient science.

Hyman was introduced by Archimedes Project Director Mark Schiefsky, professor of the classics at Harvard and an expert on ancient medicine and mechanics.

Getting the tools of computer-aided research out to scholars in the humanities, said Schiefsky later, “is a critical problem” — and one that could be solved in part by involving postdoctoral students in the process.

At the Max Planck Institute, Hyman has delivered presentations imagining a future in which computers, the Internet, and the World Wide Web open up “a virtual observatory for the humanities” for scholars thousands of miles apart.

Conventional research methods lack “intellectual mobility,” Hyman has said. They are tied to a linear system of scholarship: Research moves from peer review to book to library shelf, and then to slow retrieval through bibliographies.

Better would be a “scholarly collaboratory,” he suggested — an open, accessible federation of intellectual property and data that creates an interactive universe of knowledge on the Web.

Compared with engineering, economics, business, political management, and the natural sciences, said Hyman, the humanities got a slow and late start in taking advantage of computer technology.

Beyond a short-lived Shakespeare tabulation project in 1901, using early computer technology for humanities scholarship didn’t start until 1949, said Hyman, when Italian Jesuit priest Roberto Busa used a combination of IBM punch cards and clerical labor to index the 11 million or more words of Medieval Latin in the works of St. Thomas Aquinas and related authors.
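
For readers curious what a word index of this kind involves, here is a minimal sketch in Python. It is not Busa's procedure, which relied on punch cards and clerical labor; it only illustrates the idea of recording where every word form occurs. The function name `build_index` and the short Latin sample are assumptions made for illustration, and the sample is not a quotation from the indexed corpus.

```python
import re
from collections import defaultdict


def build_index(text):
    """Map each lowercased word form to the word offsets where it occurs.

    Offsets are 0-based positions in the sequence of words, not characters.
    """
    index = defaultdict(list)
    # [^\W\d_]+ matches runs of letters (including accented ones) only.
    for position, match in enumerate(re.finditer(r"[^\W\d_]+", text)):
        index[match.group().lower()].append(position)
    return dict(index)


# Illustrative sample only (hypothetical input, not taken from the corpus).
sample = "Veritas est adaequatio rei et intellectus; veritas est in intellectu."
index = build_index(sample)
print(index["veritas"])
```

A real index would also group inflected forms under a single headword (here, "intellectus" and "intellectu" are stored separately), which is a large part of what made the original effort so labor-intensive.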

On a parallel track, starting in 1942, were efforts to use tabulation devices for machine translation. A 1954 experiment — hailed as a success — used IBM technology to translate 49 test sentences from Russian to English. (More than 50 years later, Hyman cautioned, “the problem [of machine translation] still remains far from solved.”)

Meanwhile, statisticians began experimenting with computers to investigate disputed authorship. The most famous early study of that type came in 1964, when Harvard’s Frederick Mosteller and a colleague used statistical inference to determine that 12 disputed Federalist Papers were written by James Madison.
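
The general idea behind such studies can be sketched in a few lines of Python: count how often common "function words" appear in texts of known authorship, then see which profile a disputed text most resembles. This is a deliberately simplified illustration, not the statistical model used in the 1964 study. The word list, the distance measure, the function names, and the toy texts with their "A" and "B" author labels are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical marker list chosen for illustration only.
FUNCTION_WORDS = ("upon", "while", "whilst", "by", "to")


def rates_per_thousand(text, words=FUNCTION_WORDS):
    """Return each marker word's frequency per 1,000 words of text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    return {w: 1000 * counts[w] / total for w in words}


def distance(a, b):
    """Sum of absolute differences between two rate profiles."""
    return sum(abs(a[w] - b[w]) for w in a)


def closest_author(disputed, known):
    """Name the known author whose marker-word profile is nearest."""
    target = rates_per_thousand(disputed)
    return min(
        known,
        key=lambda name: distance(target, rates_per_thousand(known[name])),
    )


# Toy texts (invented), far too short for any real conclusion.
known = {
    "A": "upon the whole upon reflection while we wait",
    "B": "whilst by the way whilst by design",
}
guess = closest_author("whilst we go by the road", known)
print(guess)
```

A serious analysis would need far longer samples and a proper probability model to say how strong the evidence is; a nearest-profile comparison like this one only shows why small, habitual words can be revealing.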

Soon after, humanities scholars moved from simple tabulation to more complex questions of literary and linguistic analysis, said Hyman. To draw a line between eras, he used a 1987 study of Jane Austen by J.F. Burrows, “Computation into Criticism.”

Since then, the confluence of computer science and humanities scholarship has accelerated. One proof is the proliferation of groups like Harvard’s Archimedes Project, which includes links to the Perseus Digital Library at Tufts University.

Many other societies, courses, centers, and projects have appeared, including the Humanist, an international electronic seminar on humanities computing; the Association for Computers and the Humanities; the University of Nebraska-Lincoln’s Center for Digital Research in the Humanities; and the seminal journal Computers and the Humanities, founded in 1966.

“The time has come,” said Hyman, “for us to consider how we might use computers for analysis proper in the humanities.”

He cited nine contemporary research projects that illustrate innovative uses of information technology in humanities research.

In one, computers search for textual patterns in archaic documents, including early cuneiform texts. Another project investigates the evolution of historical languages using phylogenetic analysis, a technique biologists employ to study the evolution of molecules.

DocuScope, another tool described by Hyman, allows a computer user to search for representational patterns in texts, which are then displayed in color-coded “linguistical strings.”

Computer-aided digital document image enhancement allows scholars to create sharper images from documents degraded by time, or from old photographs of missing documents.

There are computer programs that analyze handwriting. Others set up virtual archives based on high-resolution scans of documents and artifacts.

Hyman’s last example was the digital Sanskrit Library project at Brown University, where he is a project adviser.

What lies ahead? Hyman asked. Today we are in a “transitional phase” of building an infrastructure for what he called “cyberscholarship.” For one, vast amounts of material still have to be digitized, and the quality of digitization so far is uneven.

But as cyberscholarship catches on in the humanities, “text mining” will improve, along with opportunities for collaborative scholarship. New publication formats will emerge, including living review journals, an online format already used in the natural sciences.

“Such tools change our work,” said Hyman, “and change us.”