Statistics Seminar: STA 290
Thursday, May 24th, 2012 at 4:10pm, MSB 1147 (Colloquium Room)
Refreshments at 3:30pm prior to seminar in MSB 4110 (Statistics Lounge)
Speaker: Hung Chen (National Taiwan University, Taiwan)
Title: On effectiveness of k-means clustering of functional data with marginal covariance matrix
Abstract: Organizing functional data into sensible groupings is one of the most fundamental ways of understanding and learning the underlying mechanism that generates the data. Recent developments in many scientific fields, including biology, economics, and signal processing, have produced a large number of huge collections of functional data. These measurements are often intricate mixtures of the initial signal sources of interest. Cluster analysis coupled with PCA is employed to search for homogeneous subgroups of individuals and to identify the mixed sources.
In this talk, we will address the following question raised in the literature. PCA captures the directions of greatest variability in the data, but this variability does not necessarily reflect the variability between hidden clusters, so transforming the data into principal components may obscure rather than reveal the groups of interest, as noted in Chang (1983, Applied Statistics), Yeung and Ruzzo (2001, Bioinformatics), and Kettenring (2006, Journal of Classification).
Conditions, stated in terms of the eigenanalysis of the between-group variation and of the combined within-cluster covariance structure, will be given under which straightforward eigenanalysis followed by a k-means clustering procedure can be counted on to reveal cluster structure in latent variable models. In addition, the limitations of discovering hidden cluster structure based on the marginal covariance function will be addressed.
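
The concern raised in the abstract, that the leading principal component of the marginal covariance need not carry the between-cluster variation, can be illustrated with a toy example. The following is a minimal sketch and not the speaker's method: the two-dimensional Gaussian data, the cluster means and spreads, and the helper names (principal_axes, two_means_1d, agreement) are all assumptions made for illustration. Two clusters differ only along the second coordinate, while the first coordinate has much larger within-cluster variance, so 2-means on the first principal component score is expected to recover the clusters poorly and 2-means on the second score is expected to recover them well.

```python
import math
import random

def principal_axes(points):
    """Mean and unit eigenvectors of the 2x2 marginal (pooled) covariance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    a = sum((p[0] - mx) ** 2 for p in points) / n
    c = sum((p[1] - my) ** 2 for p in points) / n
    b = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Closed-form major-axis angle of a symmetric 2x2 matrix [[a, b], [b, c]].
    theta = 0.5 * math.atan2(2 * b, a - c)
    pc1 = (math.cos(theta), math.sin(theta))    # largest-variance direction
    pc2 = (-math.sin(theta), math.cos(theta))   # orthogonal direction
    return (mx, my), pc1, pc2

def two_means_1d(scores, iters=100):
    """Lloyd's algorithm with k = 2 on one-dimensional scores."""
    lo, hi = min(scores), max(scores)           # initial centers
    labels = []
    for _ in range(iters):
        labels = [0 if abs(s - lo) <= abs(s - hi) else 1 for s in scores]
        g0 = [s for s, l in zip(scores, labels) if l == 0]
        g1 = [s for s, l in zip(scores, labels) if l == 1]
        new_lo, new_hi = sum(g0) / len(g0), sum(g1) / len(g1)
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return labels

def agreement(labels, truth):
    """Fraction of matching labels, maximized over the two label swaps."""
    acc = sum(l == t for l, t in zip(labels, truth)) / len(truth)
    return max(acc, 1 - acc)

# Hypothetical data: clusters separated along y, large shared spread along x.
rng = random.Random(0)
points, truth = [], []
for label, center_y in ((0, -2.0), (1, 2.0)):
    for _ in range(200):
        points.append((rng.gauss(0.0, 5.0), rng.gauss(center_y, 0.5)))
        truth.append(label)

(mx, my), pc1, pc2 = principal_axes(points)
score1 = [(x - mx) * pc1[0] + (y - my) * pc1[1] for x, y in points]
score2 = [(x - mx) * pc2[0] + (y - my) * pc2[1] for x, y in points]

acc_pc1 = agreement(two_means_1d(score1), truth)
acc_pc2 = agreement(two_means_1d(score2), truth)
print("agreement using PC1 scores:", acc_pc1)
print("agreement using PC2 scores:", acc_pc2)
```

In this constructed setting the between-cluster direction coincides with the low-variance principal component, which is exactly the kind of case where discarding trailing components before clustering would hide the group structure.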