STA 290 Seminar Series
DATE: Thursday, February 22nd, 4:10pm
LOCATION: MSB 1147, Colloquium Room
Refreshments 3:30pm, MSB 4110
SPEAKER: Tengyu Ma, Facebook / Computer Science and Statistics, Stanford University
TITLE: “Algorithmic Regularization in Over-parameterized Matrix Recovery and Neural Networks with Quadratic Activations”
Over-parameterized models are widely and successfully used in deep learning, but their workings are far from understood. In many practical scenarios, the learned model generalizes to the test data, even though the hypothesis class contains a model that completely overfits and no regularization is applied.
In this talk, we will show that this phenomenon also occurs in over-parameterized matrix recovery models, and prove that the gradient descent algorithm provides additional regularization power that prevents overfitting. The result can be extended to learning one-hidden-layer neural networks with quadratic activations. The key insight is that gradient descent prefers to search through the set of low-complexity (that is, low-rank) models first, and converges to a low-complexity model with good training error if such a model exists.
Based on joint work with Yuanzhi Li and Hongyang Zhang. https://arxiv.org/abs/1712.09203
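As a rough numerical illustration of the key insight in the abstract (not the speaker's code), the following is a minimal NumPy sketch of over-parameterized matrix recovery. Everything specific here is an assumption made for the example: symmetric Gaussian measurement matrices, a rank-2 ground truth with unit eigenvalues, a full d x d factor U started at a small multiple of the identity, and the hypothetical name run_demo with its default parameter values. The number of measurements (1000) is below the 1275 free entries of a symmetric 50 x 50 matrix, so many matrices fit the data exactly; the sketch checks which one plain gradient descent reaches.

```python
import numpy as np


def run_demo(d=50, r=2, m=1000, alpha=1e-3, eta=0.1, steps=300, seed=0):
    """Gradient descent on X = U U^T with a full d x d factor U.

    All defaults are illustrative assumptions, not values from the talk.
    """
    rng = np.random.default_rng(seed)

    # Ground truth: rank-r PSD matrix with unit nonzero eigenvalues.
    V, _ = np.linalg.qr(rng.standard_normal((d, r)))
    X_star = V @ V.T

    # Symmetric Gaussian measurement matrices A_i, one flattened per row.
    G = rng.standard_normal((m, d, d))
    A = ((G + G.transpose(0, 2, 1)) / 2).reshape(m, d * d)
    y = A @ X_star.ravel()  # y_i = <A_i, X_star>

    # Over-parameterized factor with small initialization, no regularizer.
    U = alpha * np.eye(d)
    for _ in range(steps):
        resid = A @ (U @ U.T).ravel() - y
        # M = (1/m) * sum_i resid_i * A_i; for symmetric A_i, M @ U is the
        # gradient of f(U) = (1/(4m)) * sum_i (<A_i, U U^T> - y_i)^2.
        M = (A.T @ resid).reshape(d, d) / m
        U = U - eta * (M @ U)

    X_hat = U @ U.T
    rel_err = np.linalg.norm(X_hat - X_star) / np.linalg.norm(X_star)
    train_loss = float(np.mean((A @ X_hat.ravel() - y) ** 2))
    svals = np.linalg.svd(X_hat, compute_uv=False)
    return rel_err, train_loss, svals


if __name__ == "__main__":
    rel_err, train_loss, svals = run_demo()
    print("relative recovery error:", rel_err)
    print("mean squared training residual:", train_loss)
    print("singular values above 0.1:", int((svals > 0.1).sum()))
```

Under these assumptions, the learned matrix should stay close to rank 2 even though U is free to represent any PSD matrix. The small initialization scale alpha matters in this sketch: rerunning with a much larger alpha is a simple way to probe how the effect depends on it.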