Causal Inference for Machine Learning

Current approaches to causal inference, including emerging methodologies that combine causal and machine learning methods, still face fundamental methodological challenges that prevent widespread application. These challenges are often connected with the nature of the data being analyzed. Data from randomized and observational studies can be large, unstructured, imperfectly measured, and combined from a variety of sources.

Existing causal inference methods usually address the oversimplified situation of estimating the causal effect of a single binary treatment on independent observations, for example whether or not a patient received an intervention. Real-world circumstances are rarely this simple. For instance, consider the question of how air pollution affects people's health. How do we deal with the non-random distribution of pollutants, differing levels of exposure, and variation across time and space?

With support from the Alfred P. Sloan Foundation, the Harvard Data Science Initiative is convening a three-part project on Causal Inference for Complex Treatment Regimes. Over the next two years, this project will take aim at three specific roadblocks that stand between prediction and causation:

  • Experimental Design: Improving experimental design for high-dimensional settings
  • Spillover Effects: Identifying and estimating spillover effects across space and time
  • Addressing Heterogeneity: Characterizing the different causal effects of multivalued interventions across subpopulations

See our original announcement here.


As software tools and other resources become available from the project team, we will post them here.


As part of this work, the HDSI will be convening workshops and tutorials. More information coming soon.


Francesca Dominici, Clarence James Gamble Professor of Biostatistics, Population and Data Science, Harvard T.H. Chan School of Public Health; Co-Director, Harvard Data Science Initiative
Principal Investigator

Kosuke Imai, Professor of Government and of Statistics, Harvard Faculty of Arts and Sciences

Jose Zubizarreta, Associate Professor of Health Care Policy, Harvard Medical School

Falco Bargagli Stoffi
Postdoctoral Fellow

Ambarish Chattopadhyay
Doctoral Candidate

Karissa Huang
Undergraduate Student

Cory McCartan
Doctoral Candidate

Nisha Puri
Project Manager

Xiao Wu
Doctoral Candidate


David Cutler
Otto Eckstein Professor of Applied Economics
Harvard University

Anup Malani
Lee and Brena Freeman Professor
University of Chicago Law School

Carl Morris
Emeritus Professor of Statistics
Harvard University

Joseph Newhouse
John D. MacArthur Professor of Health Policy and Management
Harvard University