STA 290 Seminar: Martin Wainwright (UC Berkeley)

Statistics Seminar: STA 290

Thursday, February 23rd, 2012 at 4.10pm, MSB 1147 (Colloquium Room)

Refreshments: 3.30pm, MSB 1147 (Colloquium Room)

Speaker:  Martin Wainwright (UC Berkeley)

Title:       Sparse and smooth: an optimal convex relaxation for high-dimensional kernel regression

Abstract:  The problem of non-parametric regression is well known to suffer from a severe curse of dimensionality, in that the required sample size grows exponentially with the dimension $d$.  Consequently, the success of statistical estimation in high dimensions relies on some kind of low-dimensional structure.  This talk focuses on non-parametric estimation within the family of sparse additive models, which consist of sums of univariate functions over $s$ unknown

We derive a simple and intuitive convex relaxation for estimating sparse additive models described by reproducing kernel Hilbert spaces, with polynomial fits, splines, and Sobolev classes as instances.  The method involves solving a second-order cone program (SOCP), and so is suitable for large-scale problems.  Working within a high-dimensional framework that allows both the dimension $d$ and the sparsity $s$ to scale, we derive convergence rates that consist of two terms: a \emph{subset selection term} that captures the difficulty of finding the unknown $s$-sized subset, and an \emph{estimation error} term that captures the difficulty of estimation over kernel classes.  Using information-theoretic methods, we derive matching lower bounds on the minimax risk, showing that the SOCP-based method is optimal.

Based on joint work with Garvesh Raskutti and Bin Yu (UC Berkeley). arXiv paper: