STA 290 Seminar: Jingfeng Wu

Location
Mathematical Sciences Building 1147

Speaker: Jingfeng Wu, Post-Doctoral Fellow, UC Berkeley

Title: "Risk Convergence and Algorithmic Regularization of Discrete-Stepsize (Stochastic) Gradient Descent"

Abstract: Gradient Descent (GD) and Stochastic Gradient Descent (SGD) are fundamental optimization algorithms in machine learning, but their behaviors sometimes defy intuitions from classic optimization and statistical learning theories. In deep learning, GD often exhibits local oscillations while still converging over time. Moreover, SGD-trained models generalize effectively even when overparameterized. In this talk, I will revisit the theories of GD and SGD for classic problems but in new scenarios motivated by deep learning, presenting two novel insights: 

(1) For logistic regression with separable data, GD with an arbitrarily large stepsize minimizes empirical risk, potentially in a non-monotonic fashion. 

(2) For linear regression and ReLU regression, one-pass SGD and its variants can achieve low excess risk, even in the overparameterized regime.

 

Faculty webpage (links to GitHub site): https://uuujf.github.io/


Seminar Date/Time: Thursday, Oct 12, 4:10pm

Location: MSB 1147, Colloquium Room

Refreshments: 3:30pm, MSB 1147 Courtyard
