Student Seminar Series
DATE: Tuesday, January 29th, 1:00pm
LOCATION: MSB 1147 (Colloquium Room)
SPEAKER: Lihua Lei, PhD Candidate, Statistics, UC Berkeley
TITLE: “Top-down Hierarchical Clustering”
ABSTRACT: The problem of community detection in networks is usually formulated as finding a single partition of the network into some “correct” number of communities. We argue that it is more interpretable and, in some regimes, more accurate to construct a hierarchical tree of communities instead. This can be done with a simple top-down recursive bi-partitioning algorithm: start with a single community and repeatedly separate the nodes into two communities by spectral clustering, until a stopping rule suggests there are no further communities. Such an algorithm is model-free, computationally efficient, and requires no tuning other than selecting a stopping rule. We show that there are regimes where it outperforms K-way spectral clustering, and propose a natural model for analyzing the algorithm’s theoretical performance, the Binary Tree Stochastic Block Model. Under this model, we prove that the algorithm correctly recovers the entire community tree under relatively mild assumptions. Our technique exploits recent developments in eigenvector perturbation theory for random matrices. Our numerical results confirm and complement our theory. This talk features joint work with Tianxi Li, Sharmodeep Bhattacharyya, Purnamrita Sarkar, Peter J. Bickel, and Elizaveta Levina.
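For readers who want a concrete picture of the recursive bi-partitioning described in the abstract, here is a minimal Python sketch. It is an illustration under stated assumptions, not the speaker's implementation. Each split uses the sign of the adjacency eigenvector with the second-largest absolute eigenvalue, which is one simple form of spectral clustering. The stopping rule (stop when that eigenvalue is at most `factor` times the square root of the average degree) is a placeholder chosen for this example, since the abstract does not specify one. The names `sample_two_block_sbm`, `bipartition`, `build_tree`, `leaves`, `min_size`, and `factor` are all hypothetical.

```python
import numpy as np


def sample_two_block_sbm(n_per_block, p_in, p_out, seed=0):
    """Hypothetical helper: symmetric 0/1 adjacency matrix with two equal blocks."""
    rng = np.random.default_rng(seed)
    labels = np.repeat([0, 1], n_per_block)
    # Edge probability is p_in within a block and p_out across blocks.
    probs = np.where(labels[:, None] == labels[None, :], p_in, p_out)
    # Sample the strict upper triangle, then mirror it (no self-loops).
    upper = np.triu(rng.random(probs.shape) < probs, k=1)
    return (upper | upper.T).astype(float), labels


def bipartition(A):
    """Split nodes by the sign of the eigenvector with the 2nd largest |eigenvalue|."""
    vals, vecs = np.linalg.eigh(A)
    second = np.argsort(-np.abs(vals))[1]
    return vecs[:, second] >= 0, abs(vals[second])


def build_tree(A, nodes=None, min_size=4, factor=2.0):
    """Recursively bi-partition the graph; returns a nested dict of node lists."""
    if nodes is None:
        nodes = np.arange(A.shape[0])
    leaf = {"nodes": nodes.tolist()}
    if len(nodes) < min_size:
        return leaf
    sub = A[np.ix_(nodes, nodes)]
    mask, second_eig = bipartition(sub)
    avg_degree = sub.sum() / len(nodes)
    # Assumed stopping rule (not from the abstract): a second eigenvalue at
    # noise level suggests there is no further community structure.
    if second_eig <= factor * np.sqrt(avg_degree) or mask.all() or not mask.any():
        return leaf
    children = [
        build_tree(A, nodes[mask], min_size, factor),
        build_tree(A, nodes[~mask], min_size, factor),
    ]
    return {"nodes": nodes.tolist(), "children": children}


def leaves(tree):
    """Collect the node lists at the leaves of the community tree."""
    if "children" not in tree:
        return [tree["nodes"]]
    return [leaf for child in tree["children"] for leaf in leaves(child)]
```

As a usage example, `A, labels = sample_two_block_sbm(100, 0.5, 0.05)` followed by `leaves(build_tree(A))` returns the estimated communities as lists of node indices. The only tuning in this sketch sits in the stopping rule (`min_size` and `factor`), which mirrors the abstract's remark that selecting a stopping rule is the sole tuning choice.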