The Shumway Lectures: Norman Breslow
Thursday, April 15th, 2010
Mathematical Sciences Building 1147 (Colloquium Room)
Speaker: Professor Norman Breslow (Department of Biostatistics, University of Washington, Seattle)
Title: Making better use of case control samples from a large cohort
Abstract: A common study design in epidemiology involves two-phase stratified sampling, on the basis of outcomes and other variables, from a large cohort. Additional covariate information, which often requires expensive bioassay of stored tissue, is collected only for the subsample. Targets of inference are parameters in (semi)parametric regression models that would be fitted to the main cohort were complete data available for everyone. The cohort (phase one sample) is regarded as a simple random sample from an infinite superpopulation (model) while the phase two sample is obtained by finite population stratified sampling from the cohort.
Options for parameter estimation include the standard sample survey Horvitz-Thompson (HT) estimator; a pseudolikelihood (PL) estimator based on ordinary likelihood scores corrected for the biased sampling; and restricted, nonparametric maximum likelihood (ML), which involves profiling over the unknown covariate distribution restricted to the observed values. The variance of the HT estimator is the sum of two components: the model-based variance of the MLE that would be calculated from complete data for the entire cohort, and the design-based variance from HT estimation of the unknown cohort total of the (efficient) influence function (IF) contributions. The second component may be reduced by adjusting the sampling weights, e.g., by calibrating them to known cohort totals of auxiliary variables correlated with the IF, or by estimating them using these same covariates. This talk presents results from two recent papers (Am J Epidemiol 169:1398-405, 2009; Statist Biosci 1:32-49, 2009) that illustrate these improvements for "case-cohort" analyses based on the Cox proportional hazards model.
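The weighting ideas in the abstract can be illustrated with a minimal sketch on simulated data. It is not the method of the cited papers. It estimates a simple cohort total rather than Cox-model influence function contributions, and it uses ratio calibration to a single auxiliary total as a stand-in for the more general calibration the abstract mentions. The function names (stratified_sample, ht_weights, ratio_calibrate, ht_total), the cohort size, and the simulated variables are all assumptions made for illustration.

```python
import random


def stratified_sample(strata, n_per_stratum, rng):
    """Draw up to n_h units without replacement from each stratum h."""
    members = {}
    for i, h in enumerate(strata):
        members.setdefault(h, []).append(i)
    sampled = [False] * len(strata)
    for h, idx in members.items():
        take = min(n_per_stratum[h], len(idx))
        for i in rng.sample(idx, take):
            sampled[i] = True
    return sampled


def ht_weights(strata, sampled):
    """Design weights N_h / n_h for sampled units, 0 for unsampled units."""
    N, n = {}, {}
    for h, s in zip(strata, sampled):
        N[h] = N.get(h, 0) + 1
        if s:
            n[h] = n.get(h, 0) + 1
    return [N[h] / n[h] if s else 0.0 for h, s in zip(strata, sampled)]


def ht_total(values, weights):
    """Horvitz-Thompson estimate of a cohort total from the weighted subsample."""
    return sum(w * v for v, w in zip(values, weights) if w > 0)


def ratio_calibrate(weights, z, z_total):
    """Rescale weights so the weighted total of z matches the known cohort total."""
    g = z_total / ht_total(z, weights)
    return [w * g for w in weights]


# Simulated cohort (hypothetical): z is a cheap auxiliary variable known for
# everyone; x is an "expensive" covariate that would be measured only at phase two.
rng = random.Random(0)
N = 2000
z = [rng.gauss(5.0, 1.0) for _ in range(N)]
x = [zi + rng.gauss(0.0, 0.5) for zi in z]
case = [1 if rng.random() < 0.1 else 0 for _ in range(N)]

# Phase two: keep every case, subsample 200 controls.
sampled = stratified_sample(case, {1: N, 0: 200}, rng)
w = ht_weights(case, sampled)
w_cal = ratio_calibrate(w, z, sum(z))

true_total = sum(x)            # available only because the data are simulated
ht_est = ht_total(x, w)        # standard HT estimate
cal_est = ht_total(x, w_cal)   # estimate with calibrated weights
print(true_total, ht_est, cal_est)
```

By construction the calibrated weights reproduce the known cohort total of z exactly. When x is strongly correlated with z, the calibrated estimate of the total of x tends to vary less across repeated phase-two samples than the plain HT estimate, which is the kind of design-based variance reduction the abstract describes.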
Opinions and commentary are those of the speaker and do not represent the Department of Statistics or the Regents of the University of California.