Statisticians have met the need to test hundreds or thousands of genomics hypotheses simultaneously with novel empirical Bayes methods that combine advantages of traditional Bayesian and frequentist statistics. Techniques for estimating the local false discovery rate assign probabilities of differential gene expression, genetic association, etc. without requiring subjective prior distributions. This book brings these methods to scientists while keeping the mathematics at an elementary level. Readers will learn the fundamental concepts behind local false discovery rates, preparing them to analyze their own genomics data and to critically evaluate published genomics research.
* dice games and exercises, including one using interactive software, for teaching the concepts in the classroom
* examples focusing on gene expression and on genetic association data and briefly covering metabolomics data and proteomics data
* gradual introduction to the mathematical equations needed
* how to choose between different methods of multiple hypothesis testing
* how to convert the output of genomics hypothesis testing software to estimates of local false discovery rates
* guidance through the minefield of current criticisms of p values
* material on non-Bayesian prior p values and posterior p values not previously published
David R. Bickel (University of Ottawa, Ottawa, Canada)
18 September 2019
1. Basic probability and statistics
    Biological background
    Probability distributions
    Probability functions
    Contingency tables
    Hypothesis tests and p values
    Bibliographical notes
    Exercises (PS1-PS3)
2. Introduction to likelihood
    Likelihood function defined
    Odds and probability: What's the difference?
    Bayesian uses of likelihood
    Bibliographical notes
    Exercises (L1-L3)
3. False discovery rates
    Introduction
    Local false discovery rate
    Global and local false discovery rates
    Computing the LFDR estimate
    Bibliographical notes
    Exercises (L4; A-B)
4. Simulating and analyzing gene expression data
    Simulating gene expression with dice
    DE games
    Effects and Estimates (E&E)
    Under the hood: normal distributions
    Bibliographical notes
    Exercises (C-E; G1-G4)
5. Variations in dimension and data
    Introduction
    High-dimensional genetics
    Subclasses and superclasses
    Medium number of features
    Bibliographical notes
    Exercise (G5)
6. Correcting bias in estimates of the false discovery rate
    Why correct the bias in estimates of the false discovery rate?
    A misleading estimator of the false discovery rate
    Corrected and re-ranked estimators of the local false discovery rate
    Application to gene expression data analysis
    Bibliographical notes
    Exercises (CFDR0-CFDR3)
7. The L value: An estimated local false discovery rate to replace a p value
    What if I only have one p value? Am I doomed?
    The L value to the rescue!
    The multiple-test L value
    Bibliographical notes
    Exercises (LV1-LV9)
8. Maximum likelihood and applications
    Non-Bayesian uses of likelihood
    Empirical Bayes uses of likelihood
    Bibliographical notes
    Exercises (M1-M2)
Appendix A. Generalized Bonferroni correction derived from conditional compatibility
    A non-Bayesian approach to testing single and multiple hypotheses
    Bibliographical notes
Appendix B. How to choose a method of hypothesis testing
    Guidelines for scientists performing statistical hypothesis tests
    Bibliographical notes
Appendix. Bibliography
David R. Bickel is an Associate Professor in the Department of Biochemistry, Microbiology and Immunology of the University of Ottawa and a Core Member of the Ottawa Institute of Systems Biology. Since 2011, he has been teaching classes focused on the statistical analysis of genomics data. While working as a biostatistician in academia and industry, he has published new statistical methods for analyzing genomics data in leading statistics and bioinformatics journals. He is also investigating the foundations of statistical inference. For recent activity, see davidbickel.com or follow him at @DavidRBickel (Twitter).