Event Date
Location
Mathematical Sciences Building 1147
Speaker: Sewon Min (Assistant Professor, Electrical Engineering and Computer Sciences, UC Berkeley)
Title: Rethinking Modularity and Abstraction in LLMs
Abstract: Today's LLMs are powerful, but I argue in this talk that they are still flawed in two ways. First, they are deployed as monolithic systems: even narrowly scoped tasks require a massive full model. Second, they are not native enough: in fact, text abstractions themselves may be unnecessary. In this talk, I will present two recent works that address these issues.
First, we focus on mixture-of-experts (MoE) models, a dominant architecture in LLMs. While MoEs appear to be modular, we show that in practice they are not: restricting inference to a subset of experts causes severe degradation, and this is intrinsic to how they are trained. We show, however, that it is possible to train an MoE such that modularity emerges naturally, without imposing human priors. Our model, EMO, enables selective use of expert subsets (down to 12.5% with minimal performance loss) while naturally organizing experts by domain.
In the second part, I argue for removing text abstractions altogether: humans perceive the world visually, and models should operate directly in pixel space. While ambitious, recent advances in VLMs make this increasingly feasible. I will present PixelRAG, a retrieval-augmented generation model that retrieves web information directly in pixel space. By eliminating complex and lossy HTML parsing, PixelRAG simplifies the pipeline while outperforming text-based RAG, even on text-centric benchmarks like SimpleQA and NQ, and also introduces a new efficiency lever through image compression.
Bio: Sewon Min is an Assistant Professor in EECS at UC Berkeley, affiliated with Berkeley AI Research (BAIR), and a Research Scientist at the Allen Institute for AI. Her research focuses on understanding and advancing large language models (LLMs), with the goal of improving their performance, flexibility, adaptability, factuality, and reasoning through new architectures and training methods. She also develops tools and infrastructure for data and model auditing. Her work has received multiple best paper awards, dissertation awards from ACM, ACL, and AAAI, and several fellowships. She earned her Ph.D. from the University of Washington and has held research positions at Meta AI, Google, and Salesforce.