American Statistical Association
Modern biological data sets often consist of large numbers of predictors and small sample sizes. This fact has generated a vast literature on statistical methods for the so-called "large p, small n" problem. Examples include gene expression data from micro-arrays, mass spectrometry (proteomic) data, and association studies relating a large number of genetic markers to a given phenotype. In this talk, I will discuss three simple modeling and computational strategies for such problems: (1) random effects to induce shrinkage and model parsimony; (2) mixtures for (empirical) Bayes prediction and classification; and (3) computation via the EM algorithm or MCMC. The main focus will be on a model we have developed for analyzing expression micro-arrays.
Date: Wednesday, June 2, 2010
Time: 4:00 - 5:00 P.M.
Memorial Sloan-Kettering Cancer Center
Department of Epidemiology and Biostatistics
307 East 63rd Street
(between First and Second Avenues)
New York, New York
Note: To gain access to the building, please follow the directions by the telephone in the foyer.