STA 290 Seminar: Brian Bullins


Event Date

Location
remotely presented via Zoom

SPEAKER: Brian Bullins, Research Assistant Professor, Toyota Technological Institute at Chicago

TITLE: "Beyond First-Order Methods for Large-Scale Optimization"

ABSTRACT: In recent years, stochastic gradient descent (SGD) has taken center stage for training large-scale models in machine learning. Although methods that go beyond first-order information may achieve better iteration complexity in theory, their per-iteration costs often render them unusable given the current growth in both available data and model size, particularly now that such models have hundreds of billions of parameters.

In this talk, I will present theoretical and practical results that address two key challenges in this setting, showing how second-order optimization may be as scalable as first-order methods. First, given the non-convexity of deep neural networks, it has become important to develop a better understanding of non-convex guarantees. Thus, I will present a Hessian-based method that provably converges to first-order critical points faster than gradient descent, alongside guarantees for converging to second-order critical points. Second, optimization methods that can be parallelized have become increasingly critical for enormous deep learning models, so I will show how we may leverage stochastic second-order information to attain faster methods in the distributed optimization setting.

BIO: Brian Bullins is a research assistant professor at the Toyota Technological Institute at Chicago. He received his Ph.D. in computer science from Princeton University, where he was advised by Elad Hazan and his research was supported by a Siebel Scholarship. His interests lie broadly in both the theory and practice of optimization for machine learning. In particular, his work on improving matrix estimation techniques has led to new second-order methods for convex and nonconvex optimization with provable guarantees, along with further applications in distributed settings. His work received a best paper award at COLT 2021.

This seminar will be presented remotely via Zoom. To obtain the link, please contact Pete Scully ([email protected]).
