American Statistical Association

Modern biological data sets often consist of large numbers of predictors and small sample sizes. This fact has generated a vast literature on statistical methods for the so-called "large p, small n" problem. Examples include gene expression data from micro-arrays, mass spectrometry (proteomic) data, and association studies involving a large number of genetic markers and a given phenotype. In this talk, I will discuss the use of three simple modeling and computational strategies for such problems: (1) random effects to induce shrinkage and model parsimony; (2) mixtures for (empirical) Bayes prediction and classification; and (3) computation via the EM algorithm or MCMC. The main focus will be on a model we have developed for analyzing expression micro-arrays.
| Date: | Wednesday, June 2, 2010 |
|---|---|
| Time: | 4:00 - 5:00 P.M. |
| Location: | Memorial Sloan-Kettering Cancer Center, Department of Epidemiology and Biostatistics, 307 East 63rd Street (between First and Second Avenues), Room 331, New York, New York. Note: To gain access to the building, please follow the directions by the telephone in the foyer. |