Statistics Seminar: Rae Yu

Event Date

Location
Mathematical Sciences Building 1147

Speaker: Rae Yu, PhD Candidate, Princeton University

Title: "Statistical Limits of Causal Trees for Heterogeneous Treatment Effect Estimation"

Abstract: Machine learning methods have become increasingly popular tools for causal inference because of their ability to adapt to complex, high-dimensional data. Yet this flexibility comes at a cost: their statistical properties are often poorly understood. Among these methods, recursive decision trees have emerged as a leading approach for the estimation and inference of heterogeneous treatment effects in both experimental and observational studies. Built on CART-type algorithms, they not only adapt to complex structure in the data, but also offer interpretability, since the induced partitions can be viewed as defining meaningful treatment subgroups.

I analyze several prominent variants of causal tree estimators and study the limits of their statistical accuracy. Under mild conditions, I show that these estimators cannot achieve a polynomial-in-sample-size uniform convergence rate, as measured by worst-case error over the covariate space. In contrast, the same procedures achieve near-optimal average convergence rates, as measured by integrated mean squared error. Moreover, data splitting, commonly used to control overfitting and enable valid inference, yields at most negligible logarithmic improvements in either sample size or dimension. As a result, strong average performance can mask substantial worst-case instability, a finding that calls for a systematic treatment of regularization. The theoretical insights are empirically validated through simulations.
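
As a reading aid, the two accuracy criteria contrasted above can be written out as follows. The notation is assumed here for illustration and is not taken from the talk: $\tau(x)$ denotes the conditional average treatment effect at covariate value $x$, $\hat{\tau}_n$ a causal tree estimate built from $n$ observations, $\mathcal{X}$ the covariate space, and $F_X$ the covariate distribution.

```latex
% Assumed notation (not from the talk): potential outcomes Y(1), Y(0), covariates X.
\[
  \tau(x) = \mathbb{E}\bigl[\, Y(1) - Y(0) \mid X = x \,\bigr]
\]
% Uniform (worst-case) error: the largest pointwise error over the covariate space.
\[
  \text{uniform error:} \qquad
  \sup_{x \in \mathcal{X}} \bigl| \hat{\tau}_n(x) - \tau(x) \bigr|
\]
% Integrated mean squared error: the squared error averaged over the covariate distribution.
\[
  \text{integrated MSE:} \qquad
  \mathbb{E} \int_{\mathcal{X}} \bigl( \hat{\tau}_n(x) - \tau(x) \bigr)^{2} \, dF_X(x)
\]
```

In this notation, a polynomial-in-sample-size uniform rate would mean the first quantity shrinks like $n^{-a}$ for some $a > 0$. The second quantity averages over $x$, so it can be small even when the error is large on a region of small probability.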

I conclude by situating these results within my broader research agenda on developing rigorous theoretical foundations for causal inference in complex observational and experimental settings, including settings with spatial structure and network spillovers.

This talk is part of the STA 290 Seminar series.