Estimation And Inference Of Random Effect Models With Applications To Population Genetics And Proteomics
In this dissertation I present two methodologies for estimation and inference of random effect models, with applications to population genetics and proteomics. The first methodology, SnIPRE, is designed for identifying genes under natural selection. SnIPRE is a "McDonald-Kreitman" type of analysis: it is based on MK table data, and it has an advantage over other types of statistics because it is robust to demography. Like the MKprf method, SnIPRE makes use of genome-wide information to increase power, but it is nonparametric in the sense that it makes no assumptions about (and does not require estimation of) parameters such as mutation rate and species divergence time in order to identify genes under selection. In simulations, SnIPRE outperforms both the MK statistic and the two versions of MKprf considered. Under suitable assumptions, SnIPRE may also be used to estimate population parameters, and in chapter 3 I discuss the robustness of the method to the assumption of independent sites. I also propose a procedure for more precise estimation of the confidence bounds of the selection effect, and then apply the method to Drosophila and human-chimp comparison data. PROWLRE, an empirical Bayes method for analyzing shotgun-proteomics data, is introduced in the final chapter. While a fully Bayesian implementation of this model is straightforward, the empirical Bayes implementation is more challenging. I present an EM algorithm designed for fitting this latent variable model and then compare its results to those of the fully Bayesian estimation on simulated and synthetic data.
Mixed Effect Models; Natural Selection; Proteomics
Bustamante, Carlos D.
Booth, James; Wells, Martin Timothy
Ph. D., Statistics
Doctor of Philosophy
dissertation or thesis