Speaker: Jingfeng Wu, Post-Doctoral Fellow, UC Berkeley
Title: "Risk Convergence and Algorithmic Regularization of Discrete-Stepsize (Stochastic) Gradient Descent"
Abstract: Gradient Descent (GD) and Stochastic Gradient Descent (SGD) are fundamental optimization algorithms in machine learning, but their behaviors sometimes defy intuitions from classic optimization and statistical learning theories. In deep learning, GD often exhibits local oscillations while still converging over time. Moreover, SGD-trained models generalize effectively even when overparameterized. In this talk, I will revisit the theories of GD and SGD for classic problems but in new scenarios motivated by deep learning, presenting two novel insights:
(1) For logistic regression with separable data, GD with an arbitrarily large stepsize minimizes empirical risk, potentially in a non-monotonic fashion.
(2) For linear regression and ReLU regression, one-pass SGD and its variants can achieve low excess risk, even in the overparameterized regime.
Faculty webpage (links to GitHub site): https://uuujf.github.io/
Seminar Date/Time: Thursday, Oct 12, 4:10pm
Location: MSB 1147, Colloquium Room
Refreshments: 3:30pm, MSB 1147 Courtyard