American Statistical Association
New York City
Metropolitan Area Chapter

Memorial Sloan-Kettering Cancer Center
Biostatistics Seminar

Li-Xuan Qin
Department of Epidemiology & Biostatistics
Memorial Sloan-Kettering Cancer Center


Background: MicroRNA microarrays possess a number of unique data features that challenge the key assumption behind existing normalization methods. These methods need to be re-assessed using genuine benchmark datasets that realistically represent the data characteristics of microRNA arrays.

Methods: We developed a blocked randomization design for Agilent microRNA arrays and applied it to generate a benchmark dataset, free of confounding batch effects, comparing endometrial and ovarian tumors. The benchmark dataset was assessed for differential expression and treated as the gold standard. We used the same tumor samples to generate a test dataset allowing for batch effects. After normalization, the test dataset was assessed for differential expression and compared with the gold standard. In addition to this empirical evaluation, we simulated data from the test dataset to mimic a range of differential expression patterns, varying in the amount and the degree of asymmetry of differential expression, and further assessed the performance of normalization methods.
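As an illustration of the blocked randomization idea described above, here is a minimal sketch in Python. It is not the speaker's actual design: the function name, the block size, and the sample labels are all hypothetical, and the sketch simply assumes that each block (for example, one array slide) should hold an equal number of samples from each tumor type, with random assignment to blocks and random order within a block.

```python
import random

def blocked_randomization(samples, block_size, seed=0):
    """Assign (sample_id, group) pairs to blocks balanced by group.

    Hypothetical helper: assumes block_size is a multiple of the number
    of groups. Samples that do not fill a complete balanced block are
    left unassigned.
    """
    rng = random.Random(seed)

    # Collect sample ids per group, then shuffle within each group.
    by_group = {}
    for sample_id, group in samples:
        by_group.setdefault(group, []).append(sample_id)
    for ids in by_group.values():
        rng.shuffle(ids)

    groups = sorted(by_group)
    per_group = block_size // len(groups)
    n_blocks = min(len(by_group[g]) for g in groups) // per_group

    layout = []
    for b in range(n_blocks):
        block = []
        for g in groups:
            chunk = by_group[g][b * per_group:(b + 1) * per_group]
            block.extend((sid, g) for sid in chunk)
        rng.shuffle(block)  # randomize position within the block
        layout.append(block)
    return layout

# Hypothetical example: 8 endometrial and 8 ovarian samples, blocks of 4.
samples = [(f"E{i}", "endometrial") for i in range(8)] + \
          [(f"O{i}", "ovarian") for i in range(8)]
layout = blocked_randomization(samples, block_size=4)
```

Because every block contains both tumor types in equal numbers, any block-level (batch) effect is balanced across the groups being compared rather than confounded with them.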

Results: We observed moderate and asymmetric differential expression between endometrial and ovarian tumors in the benchmark dataset. Array effects were observed in the test data and resulted in a true positive rate of 53% and a false discovery rate of 90%. Normalization methods are useful in increasing the number of true positive markers identified but still yield a large number of false positive markers, with a false discovery rate as high as 55%. We observed similar results in our simulated datasets.
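For readers unfamiliar with the two metrics quoted above, the following minimal Python sketch shows how a true positive rate and a false discovery rate can be computed by comparing the markers called in a test dataset against a gold-standard list. The function name and the marker identifiers are hypothetical and are not taken from the study.

```python
def tpr_and_fdr(called, truth):
    """Compare called markers with a gold-standard set of markers.

    TPR = true positives / number of gold-standard markers
    FDR = false positives / number of called markers
    """
    called, truth = set(called), set(truth)
    true_pos = len(called & truth)
    false_pos = len(called - truth)
    tpr = true_pos / len(truth) if truth else float("nan")
    fdr = false_pos / len(called) if called else 0.0
    return tpr, fdr

# Hypothetical marker names, for illustration only.
truth = {"m1", "m2", "m3", "m4"}   # markers in the gold standard
called = {"m1", "m2", "m5"}        # markers called in the test data
tpr, fdr = tpr_and_fdr(called, truth)
```

In this toy example two of the four gold-standard markers are recovered, and one of the three called markers is a false positive.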

Conclusions: Our study demonstrates the utility of randomization and blocking in a large tumor microarray study and underlines their important benefits for the accurate detection of disease-relevant markers. Proper randomization and blocking should be adopted in microarray studies to the extent possible. Our paired array datasets provide an objective and realistic evaluation of normalization methods for miRNA arrays, and they show that current normalization methods are useful in increasing the number of true positive markers identified but still yield a large number of false positive markers. Research is warranted to develop more efficient normalization methods for cases where normalization is needed.

Date: Wednesday, June 19, 2013
Time: 11:00 A.M. - 12:00 P.M.
Location: Memorial Sloan-Kettering Cancer Center
Department of Epidemiology and Biostatistics
307 East 63rd Street
(between First and Second Avenues)
3rd Floor Conference Room
New York, New York
Note: To gain access to the building, please follow the directions by the telephone in the foyer.


The International Year of Statistics (Statistics2013)

Page last modified on June 14, 2013

Copyright © 1998-2013 by New York City Metropolitan Area Chapter of the ASA
Designed and maintained by Cynthia Scherer