American Statistical Association
Consider a linear regression model in which the number of variables p and the sample size n are both large, but p > n. The coefficient vector is unknown but sparse, in the sense that only a small proportion of its coordinates are nonzero, and we are interested in identifying the nonzero ones. We model the coordinates of the coefficient vector as samples from a two-component mixture.
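For readers who prefer notation, the setup above can be written roughly as follows. This is a sketch only: the noise term z, the sparsity level \epsilon, and the distribution \pi of the nonzero coefficients are symbols introduced here for illustration and are not given in the abstract.

\[
Y = X\beta + z, \qquad X \in \mathbb{R}^{n \times p}, \quad p > n,
\]
\[
\beta_j \overset{\text{iid}}{\sim} (1 - \epsilon)\,\nu_0 + \epsilon\,\pi, \qquad j = 1, \dots, p,
\]

where \nu_0 denotes the point mass at 0 and a small \epsilon corresponds to a sparse coefficient vector.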
We propose a two-stage variable selection procedure, which we call the UPS. It is a Screen and Clean procedure: we screen with univariate thresholding and clean with the penalized MLE. In many situations, the UPS possesses two important properties: Sure Screening and Separable After Screening (SAS). These properties allow us to reduce the original regression problem to many small regression problems that can be fitted separately. As a result, the UPS is effective both in theory and in computation.
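The screen-and-clean idea can be illustrated with a short Python sketch. It is not the speaker's implementation. It assumes that screening keeps the columns whose marginal correlation with the response exceeds a threshold, that the survivors are split into groups by linking columns whose Gram-matrix entry exceeds a cutoff, and that each small group is cleaned by a brute-force L0-penalized least squares fit, used here as a stand-in for the penalized MLE. The function names and the parameters threshold, delta, and penalty are hypothetical.

```python
import itertools
import numpy as np


def screen(X, y, threshold):
    """Univariate thresholding: keep columns j with |x_j' y| > threshold."""
    return np.flatnonzero(np.abs(X.T @ y) > threshold)


def components(gram, idx, delta):
    """Split surviving indices into connected components of the graph
    that links i and j whenever |gram[i, j]| > delta (an assumed rule)."""
    remaining = set(idx.tolist())
    groups = []
    while remaining:
        stack = [remaining.pop()]
        group = []
        while stack:
            i = stack.pop()
            group.append(i)
            linked = {j for j in remaining if abs(gram[i, j]) > delta}
            remaining -= linked
            stack.extend(linked)
        groups.append(sorted(group))
    return groups


def clean(X, y, group, penalty):
    """Brute-force L0-penalized least squares on one small group.
    Feasible only because each group is assumed to be small."""
    best_support, best_cost = (), float(y @ y)  # empty model as baseline
    for k in range(1, len(group) + 1):
        for support in itertools.combinations(group, k):
            cols = X[:, list(support)]
            coef, *_ = np.linalg.lstsq(cols, y, rcond=None)
            resid = y - cols @ coef
            cost = float(resid @ resid) + penalty * k
            if cost < best_cost:
                best_support, best_cost = support, cost
    return best_support


def screen_and_clean(X, y, threshold, delta, penalty):
    """Screen all columns, then clean each group of survivors separately."""
    survivors = screen(X, y, threshold)
    gram = X.T @ X
    selected = []
    for group in components(gram, survivors, delta):
        selected.extend(clean(X, y, group, penalty))
    return sorted(selected)
```

The point of the design is that the expensive subset search runs only within each small group of survivors, never over all p columns at once, which is what makes fitting the pieces separately cheap.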
Dr. Jin graduated in Statistics from Stanford in 2003, joined Statistics at Purdue University the same year, and moved to Statistics at Carnegie Mellon University in 2007. Main interests: large-scale multiple testing, high-dimensional classification, variable selection, clustering, graph theory, genomics, cosmology, and astronomy.
Date: Thursday, March 10, 2011
Time: 4:00 - 5:00 P.M.
Mailman School of Public Health
Department of Biostatistics
722 West 168th Street
Biostatistics Computer Lab
6th Floor - Room 656
New York, New York