A statistician and a computer scientist have been named co-leaders of Harvard’s new Data Science Initiative, the Harvard University Office of the Vice Provost for Research announced today.

A University-wide program that will aid cross-disciplinary collaboration, the initiative will be led by Francesca Dominici, professor of biostatistics at the Harvard T.H. Chan School of Public Health, and David C. Parkes, George F. Colony Professor and area dean for computer science at the Harvard John A. Paulson School of Engineering and Applied Sciences.

“With its diversity of disciplines, Harvard has access to large data sets that record a staggering array of phenomena,” said Provost Alan Garber. “Researchers in our Schools of medicine, public health, business, law, arts and sciences, government, education, and engineering are already gaining deep insights from their analyses of large data sets. This initiative will connect research efforts across the University in this emerging field. It will facilitate cross-fertilization in both teaching and research, paving the way to methodological innovations and to applications of these new tools to a wide range of societal and scientific challenges.”

As massive amounts of data are generated from science, engineering, social sciences, and medicine — and even from digitally augmented daily lives — researchers are grappling with how to make sense of all this information, and how to use it to benefit people. Data science applies the theory and practice of statistics and computer science to extract useful knowledge from complex and often messy information sources. Applications span health care, the environment, commerce, government services, urban planning, and finance. The initiative will make it possible to take methodology and tools from one domain to another and discover new applications.

Data science for a new era

Until now, Harvard’s growth in data science has been organic, occurring in distinct domains and in an increasing array of applications. The initiative will unite these efforts. Planning was led by a steering committee and involved 55 faculty members and many of Harvard’s data science leaders.

The initiative has already launched the Harvard Data Science Postdoctoral Fellowship program, which will support up to seven scholars over two years. The fellowship is open to scholars whose interests are in data science, broadly construed, including researchers with a methodological focus and those with an applications focus.

The first cohort of fellows will arrive in the fall; they will direct their own research while forging collaborations around the University. The program will offer numerous opportunities to engage with the broader data science community through seminar series, informal lunches, mentoring, and networking events, some of them led by the fellows themselves.

The initiative has also launched the Harvard Data Science Initiative Competitive Research Fund, which invites innovative ideas from those with interests that span data science, including methodological foundations and the development of quantitative methods and tools motivated by application challenges.

In addition, three master’s degree programs have been approved. The Medical School offers a master’s degree in biomedical informatics, and the Harvard Chan School has a master’s of science in health data science. A master’s in data science in the Faculty of Arts and Sciences, jointly offered by Computer Science and Statistics, is planned for the fall of 2018.

“The ability to apply the power of new analytics and new methodologies in revolutionary ways makes this the era of data science, and Harvard faculty have been at the forefront of this emerging field,” said Vice Provost for Research Rick McCullough. “Our researchers not only develop new methodologies, but also apply those methodologies to incredible effect. I am delighted that Francesca Dominici and David Parkes will be co-directing this new effort. They are both extraordinary scientists and exemplary colleagues.”

Dominici specializes in developing statistical methods to analyze large and complex data sets. She leads multiple interdisciplinary groups of scientists addressing questions in environmental health science, climate change, and health policy.

“Harvard’s Data Science Initiative will build on the collaborations that already exist across the University to foster a rich and cohesive data science community that brings together scholars from across disciplines and schools,” Dominici said. “I am delighted to be a part of an effort that pushes the frontiers of this important discipline and extends our ability to use data science for the good of people everywhere.”

Parkes leads research at the interface between economics and computer science, with a focus on multi-agent systems, artificial intelligence, and machine learning.

“The Data Science Initiative will strengthen the fabric of connections among departments to create an integrated data science community,” Parkes said. “Through these efforts, we seek to empower research progress and education across the University, and work toward solutions for the world’s most important challenges. I look forward to being a part of this exciting work.”

The Data Science Steering Committee, in addition to Dominici and Parkes, includes:

  • Alyssa Goodman, professor of applied astronomy, Faculty of Arts and Sciences;
  • Gary King, director, Harvard Institute for Quantitative Social Science;
  • Zak Kohane, chair of the Department of Biomedical Informatics, Harvard Medical School;
  • Xihong Lin, chair of the Department of Biostatistics, Harvard Chan School;
  • Anne Margulies, University chief information officer;
  • Hanspeter Pfister, professor of computer science, Harvard Paulson School;
  • Neil Shephard, chair of the Department of Economics and of Statistics, Faculty of Arts and Sciences.

For more information about the initiative, visit datascience.harvard.edu.