Statistics Seminar: STA 290
Thursday, February 23rd, 2012 at 4.10pm, MSB 1147 (Colloquium Room)
Refreshments: 3.30pm, MSB 1147 (Colloquium Room)
Speaker: Martin Wainwright (UC Berkeley)
Title: Sparse and smooth: an optimal convex relaxation for high-dimensional kernel regression
Abstract: The problem of non-parametric regression is well known to suffer from a severe curse of dimensionality, in that the required sample size grows exponentially with the dimension $d$. Consequently, the success of statistical estimation in high dimensions relies on some kind of low-dimensional structure. This talk focuses on non-parametric estimation within the family of sparse additive models, which consist of sums of univariate functions over $s$ unknown coordinates.
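In symbols, a sparse additive model can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the talk: $f^*$ denotes the regression function, $S$ the unknown set of active coordinates, and $f_j^*$ the univariate component for coordinate $j$.

$$
f^*(x_1, \ldots, x_d) = \sum_{j \in S} f_j^*(x_j), \qquad S \subseteq \{1, \ldots, d\}, \quad |S| = s.
$$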
We derive a simple and intuitive convex relaxation for estimating sparse additive models described by reproducing kernel Hilbert spaces, with polynomial fits, splines, and Sobolev classes as instances. The method involves solving a second-order cone program (SOCP), and so is suitable for large-scale problems. Working within a high-dimensional framework that allows both the dimension $d$ and sparsity $s$ to scale, we derive convergence rates that consist of two terms: a \emph{subset selection term} that captures the difficulty of finding the unknown $s$-sized subset, and an \emph{estimation error} that captures the difficulty of estimation over kernel classes. Using information-theoretic methods, we derive matching lower bounds on the minimax risk, showing that the SOCP-based method is optimal.
Based on joint work with Garvesh Raskutti and Bin Yu (UC Berkeley).
arXiv paper: http://arxiv.org/abs/1008.3654