STA 290 Seminar: Bruce Rannala (Genome Center, UC Davis)

Statistics Seminar: STA 290

Tuesday, December 4th, 2012 at 4:10pm, MSB 1147 (Colloquium Room)
Refreshments at 3:30pm in MSB 4110 (Statistics Lounge), prior to the seminar

Speaker: Bruce Rannala (Genome Center, UC Davis)

Title: "Statistical analysis of pooled samples for whole-genome case-control associations: A study of lung cancer in Thailand using 660,000 single-nucleotide polymorphisms"

Abstract: Genome-wide association studies (GWASs) have been applied extensively in case-control designs to identify single-nucleotide polymorphisms (SNPs) in the human genome that are linked to complex diseases, including cancer. However, the newest chip-based assays, which interrogate hundreds of thousands of SNPs and examine individual genotypes, remain expensive: large-scale studies cost upwards of a million dollars, and most GWASs are therefore conducted in developed countries. GWASs of populations in the developing world may help to identify new variants associated with disease, because many developing countries have unique population genetic compositions and environmental exposures that differ from those of developed countries. A promising strategy that allows geneticists to carry out cost-effective GWASs in developing countries is the pooled GWAS, in which DNA samples from cases and controls are separately pooled and genotyped on single chips. Results are presented from a pooled GWAS aimed at identifying SNPs that influence lung cancer susceptibility in the population of Northern Thailand (which has a high incidence of lung cancer relative to other regions). Our study used the Illumina Infinium Human660W Quad BeadChip, which queries approximately 550,000 SNPs identified by the Human HapMap project. I will describe some of the statistics used for comparing SNP allele frequencies in cases versus controls, as well as methods for eliminating chip- or pooling-based artifacts. About a dozen "SNPs of interest" were identified that appeared to differ in frequency between cases and controls, with p-values ranging from 10^(-3) to 10^(-8). Several of the identified SNPs have been previously associated with cancers or are linked to genes with a known role in cancer.
The total cost of this study was more than an order of magnitude lower than that of a comparable individual genotype-based study (roughly $10,000 USD for the genomic component, versus a projected cost of about $350,000 USD for an individual genotype study).