Math-Stat Colloquium: Michael I. Jordan (UC Berkeley)


Department of Statistics
Department of Mathematics
University of California, Davis

Wednesday, April 4th, 2012, 4:10pm in MSB 1147 (Colloquium Room)
Refreshments at 3:30pm in Colloquium Room prior to the talk

Speaker: Michael I. Jordan (University of California, Berkeley)

Title: Statistics and Computation in the Age of Massive Data

Abstract: There are many issues remaining to be addressed, or even formulated, at the interface of statistics and computation. One way to capture the current state of affairs is the following: If we view data as a resource, how can it be that in many practical problems of interest we find ourselves uncomfortable at being given too much data? The issue is both statistical and computational: on a fixed computational budget we are unable to guarantee that the statistical risk decreases as the number of data points grows (without bound). A general theory not yet being available, I present two initial forays into the problem domain. The first is an exploration of the bootstrap in the regime of very large data sets, where it is computationally infeasible to obtain bootstrap resamples. I present a new procedure, the "bag of little bootstraps," which inherits the favorable theoretical properties of the bootstrap but is also scalable. The second is an exploration of divide-and-conquer strategies for matrix completion. Here the theoretical support is provided by concentration theorems for random matrices, and I present a new approach to this problem based on Stein's method. [Joint work with Ariel Kleiner, Lester Mackey, Purna Sarkar, Ameet Talwalkar, Richard Chen, Brendan Farrell and Joel Tropp.]
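
Illustration (not part of the speaker's abstract): a minimal sketch of the bag of little bootstraps idea, assuming the commonly described form of the procedure. Draw a few small subsets of size b = n^gamma without replacement; within each subset, simulate resamples of nominal size n by drawing multinomial counts over the b points; compute the quality measure (here a standard error) per subset; then average across subsets. The function names, the choice gamma = 0.6, and the numbers of subsets and resamples are all assumptions made for illustration, and the estimator is simply the mean.

```python
import random
import statistics
from collections import Counter


def weighted_mean(values, counts):
    """Mean of `values` where values[i] is repeated counts[i] times."""
    total = sum(counts)
    return sum(v * c for v, c in zip(values, counts)) / total


def blb_standard_error(data, estimator=weighted_mean, gamma=0.6,
                       num_subsets=5, num_resamples=50, seed=0):
    """Sketch of a bag-of-little-bootstraps standard error estimate.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    n = len(data)
    # Subset size b = n^gamma, kept within [2, n].
    b = min(n, max(2, int(n ** gamma)))
    per_subset_se = []
    for _ in range(num_subsets):
        # Small subset drawn without replacement from the full data.
        subset = rng.sample(data, b)
        estimates = []
        for _ in range(num_resamples):
            # Multinomial counts: n draws spread over the b subset points.
            # Drawing via rng.choices costs O(n) time; a numerical library
            # could sample the counts directly. Only b values are stored.
            draws = Counter(rng.choices(range(b), k=n))
            counts = [draws.get(i, 0) for i in range(b)]
            estimates.append(estimator(subset, counts))
        # Quality measure for this subset: spread of the resampled estimates.
        per_subset_se.append(statistics.stdev(estimates))
    # Average the per-subset quality measures.
    return statistics.fmean(per_subset_se)


if __name__ == "__main__":
    gen = random.Random(1)
    sample = [gen.gauss(0.0, 1.0) for _ in range(2000)]
    print(blb_standard_error(sample))
```

Because each resample is represented by counts over only b distinct points, the estimator never touches more than b values at a time, which is where the scalability comes from. For standard normal data with n = 2000, the result should be of the same order as 1/sqrt(n), roughly 0.02.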