Speaker: Min-ge Xie, Distinguished Professor, Department of Statistics, Rutgers University
Title: "Fisher Inversion, Repro Samples Method, and Principled Random Forests"
Abstract: Rapid developments in data science and the desire for interpretable AI call for innovative frameworks to tackle frequently encountered but highly non-trivial inference problems, e.g., those involving discrete or non-numerical parameters and those involving non-numerical data. This talk presents an effective and wide-reaching framework, called the repro samples method, for conducting statistical inference in these problems and more. We develop theory to support the framework and provide effective computing algorithms for problems in which explicit solutions are unavailable. A particular focus is a commonly encountered non-trivial inference problem that involves both discrete/non-numerical and continuous parameters. We propose an effective two-step procedure to make inferences for all parameters, and we use the Fisher inversion method to develop a unique matching scheme that turns the disadvantage of lacking theoretical tools for discrete/non-numerical parameters into an advantage in computational efficiency. The effectiveness of the method is illustrated through a case study: developing a novel machine learning ensemble tree model, called principled random forests. Specifically, we first construct a confidence set for the underlying (‘true’) tree model that generated (or approximately generated) the observed data. We then obtain a tree ensemble model using the confidence set, from which we derive our inference. The development is principled and interpretable because, first, it is fully theoretically supported and provides frequentist performance guarantees on both inference and prediction; and second, the approach assembles only a small number of trees from the confidence set, so the resulting model is more interpretable. The development is further extended to handle tree-structured conditional average treatment effects in a causal inference setting.
Numerical results demonstrate that our proposed approach outperforms existing single and ensemble tree methods.
Fisher inversion and the repro samples method provide a new toolset for developing interpretable AI and for helping to address the black-box issues in complex machine learning models. The development of principled random forests is our first attempt in this direction.
Bio: Min-ge Xie, PhD, is a Distinguished Professor at Rutgers, The State University of New Jersey. Dr. Xie received his PhD in Statistics from the University of Illinois at Urbana-Champaign and his BS in Mathematics from the University of Science and Technology of China. He is the current Editor of The American Statistician and a co-founding Editor-in-Chief of The New England Journal of Statistics in Data Science. He is a fellow of the ASA and the IMS and an elected member of the ISI. His research interests include the theoretical foundations of statistical inference and data science, fusion learning, finite- and large-sample theory, and parametric and nonparametric methods. He is the Director of the Rutgers Office of Statistical Consulting and has rich interdisciplinary research experience collaborating with computer scientists, engineers, biomedical researchers, and scientists in other fields.
Faculty website (links to Rutgers University): https://statweb.rutgers.edu/mxie/
Seminar Date/Time: Thursday, October 17, 2024, at 4:10pm
Location: MSB 1147 (Refreshments at 3:30pm, MSB courtyard)