Causal Seminar: Mike Baiocchi, Stanford University

Hawes Hall, Classroom 102, Harvard Business School

How to tell the difference between machine learning and (bio)statistics

We’ll start this talk by discussing a couple of studies: (i) a randomized trial to evaluate a sexual assault prevention program in Nairobi, Kenya, and (ii) a remote detection operation to find and disrupt labor trafficking in the Amazon rainforest. These are both “data science” projects, but they are wildly different in how they work. What makes them so different? For a long time in (bio)statistics we had only two fundamental ways of reasoning using data: warranted reasoning (e.g., randomized trials) and model reasoning (e.g., linear models). In the 1980s a new, extraordinarily productive way of reasoning about algorithms emerged: “outcome reasoning.” Outcome reasoning has come to dominate areas of data science, but it has been under-discussed and its impact under-appreciated. For example, it is the primary way we reason about “black box” algorithms.

In this talk we will discuss the current use of outcome reasoning (i.e., as “the common task framework”) and its limitations. We will show why a large class of prediction problems is inappropriate for this new type of reasoning. We will then discuss a way to extend this type of reasoning for use, where appropriate, in assessing algorithms for deployment (i.e., when using a predictive algorithm “in the real world”). We purposefully developed this new framework so that both technical and non-technical people can discuss and identify key features of their prediction problem.


Mike Baiocchi

Associate Professor

Department of Epidemiology and Population Health, Stanford University

Optional pre-reading:

Co-speaker:

  • Jordan Rodu, Assistant Professor, Department of Statistics, University of Virginia

Discussant:

  • Matthew Blackwell, Associate Professor of Government, Harvard University Faculty of Arts & Sciences; Faculty Affiliate, Institute for Quantitative Social Science at Harvard University

Moderator:

  • Luke W. Miratrix, Associate Professor of Education, Harvard Graduate School of Education; Co-Faculty Director, Doctor of Philosophy in Education Program at the Harvard Graduate School of Education