Harvard’s new Data Science Initiative had its public kickoff this week, bringing together the University’s practitioners in the field for the first of a planned series of seminars focusing on the best ways to handle and analyze data, including the enormous data sets now available.

The event at Harvard Law School’s Austin Hall was the first of the initiative’s “45 + 45 seminars” featuring a pair of 45-minute talks. Cynthia Dwork, Gordon McKay Professor of Computer Science and Radcliffe Alumnae Professor at the Radcliffe Institute for Advanced Study, talked about the methodology around privacy and about keeping large data sets — of patient information, for example — from prying eyes.

Dwork, an expert in cryptography, discussed several aspects of safeguarding information, including the role of “differential privacy,” which she invented in 2006 and which has been deployed as part of Apple’s iOS 10 operating system and Google’s Chrome browser.

“It’s an honor to be here and an honor to help launch this initiative,” Dwork said.

The other speaker was Isaac Kohane, head of Harvard Medical School’s Department of Biomedical Informatics and the Marion V. Nelson Professor of Biomedical Informatics. Kohane presented case studies in which big data provided insight into conditions such as autism, adverse drug reactions, and antibiotic resistance.

The medical industry, Kohane said, comes nowhere close to using the vast amount of data it routinely gathers to its full potential. That results in missed opportunities for cost savings in an industry that accounts for 15 percent of the U.S. gross domestic product. It also means doctors may not understand the damage a drug can do until it’s too late for thousands of patients, as happened with Vioxx, a painkiller withdrawn from the market in 2004 because it increased risks of heart attack and stroke.

“Medicine … takes care of you and your loved ones, and does it in a way that is not informed by data science, [and that] is something that should concern you greatly,” Kohane told the gathering of about 80. He added that he’s encouraged that medical students seem enthusiastic about data science. “For me, this is the way we’re going to change medicine.”

Richard McCullough, Harvard’s vice provost for research, who led planning for the initiative, said it was two years in the making and resulted from consultations with a large number of faculty members and deans. Its role, McCullough said, is to bring faculty members together from across campus “in a way that makes us stronger than the sum of our parts.”

“The greatest thing about the initiative is there’s been overwhelming support from the faculty and deans to push this forward,” McCullough said.

Harvard Provost Alan Garber said he has great hopes for the initiative, noting that unprecedented amounts of data in a range of fields await analysis. He said he is excited about its potential because of the breadth of faculty expertise.

“I think that if we’re successful in this regard, the students who don’t think of themselves as data scientists today will actually have their professional lives transformed by this initiative because I believe very strongly that this initiative and the adoption of the techniques and ways of thinking of data science will be transforming virtually every field of inquiry in the University,” Garber said. “Certainly, in the life sciences as well as other areas, students with a deep background in data science will be best equipped to lead advances in their specific areas of application.”


The initiative is an effort to support and build community around Harvard researchers’ efforts to generate, handle, and analyze data, including the big data that has emerged as a force in scientific fields and is now changing the social sciences and humanities. Established as a University-wide initiative in March, the project is initially supported by the Harvard Provost’s Office and by Elsevier and Microsoft.

The initiative first announced the appointment of eight data science postdoctoral fellows who will be working at Harvard this year, as well as research grants to five faculty members for data science-related projects. Early support for the postdoctoral fellows came in part from the Alfred P. Sloan Foundation’s Economics Program.

The initiative is co-directed by David Parkes, George F. Colony Professor of Computer Science at the Harvard Paulson School of Engineering and Applied Sciences, and Francesca Dominici, professor of biostatistics at the Harvard T.H. Chan School of Public Health.

Parkes and Dominici offered opening remarks at the event. Dominici outlined the initiative’s “bold vision” to help faculty members working in this area advance their research and become a unifying force across the University. “We really hope we can work across departmental boundaries and School boundaries,” Dominici said. “This is really an opportunity to empower the faculty, whatever research they are doing, to make your department better and your School better.”

Parkes said it’s important that Harvard, as a leading educational institution, be involved in this area and that engagement from faculty across the University will be important for the initiative’s success. “We see it as an intellectual imperative that Harvard be leading in data science,” Parkes said. “With a University of this breadth … we have to be doing this.”

Harvard in Allston: Data science

In this episode, we talk to Harvard professors David Parkes and Francesca Dominici about Harvard’s new data science initiative. Parkes and Dominici discuss data science as an emerging discipline and how it unites efforts in fields such as computer science, education, medicine, and government and policy.