Methods For Multivariate Longitudinal Count And Duration Models With Applications In Economics
The quality and quantity of social science data are continually improving, from large public-use survey microdata to private industry data. This wealth of data allows researchers to ask more complex questions about interdependencies of social and economic processes and behavior. This dissertation presents methods for models that address interdisciplinary research questions about the association structure of multiple outcomes of similar or disparate types, e.g., count and duration outcomes. The proposed models and methods address associations of multiple outcomes through correlated unobserved subject-specific effects. Chapter 2 presents a semiparametric method for estimating the marginal response and association parameters in a random effects multivariate longitudinal count model. Within the generalized estimating equations (GEE) framework, we use a specific form of the covariance matrix of the response vector, based on a model that induces dependence over time and across outcomes through random effects. This moment-based method is robust to distributional misspecification and reduces the computational burden associated with a high-dimensional joint distribution by avoiding parametric assumptions on the response and unobserved effects. Through a simulation study we compare the finite sample robustness properties of this semiparametric method with those of a pseudo-likelihood approach that imposes distributional assumptions. Both methods are then used to analyze a dataset of insurance claim counts for three types of coverage over time. The economic significance of these results is presented in Chapter 3. Chapter 4 presents a Gaussian variational approximation (GVA) approach for estimation of a joint multivariate longitudinal count and multivariate duration random effects model. GVA posits an approximate posterior distribution for the random effects in order to obtain a closed-form lower bound on the marginal likelihood.
GVA estimators are obtained by maximizing the variational lower bound, which is equivalent to minimizing the Kullback-Leibler divergence between the assumed approximate posterior distribution and the true posterior distribution of the random effects. This approach circumvents the computationally demanding, high-dimensional integral associated with the marginal distribution of a joint longitudinal and duration model. Through a simulation study we compare the finite sample properties of the variational approximation approach with those of comparable univariate and multivariate two-stage plug-in approaches. These methods are then used to analyze a dataset of insurance claim counts and policy duration for three types of coverage over time.
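The closed-form lower bound described above can be illustrated with a minimal sketch. It does not reproduce the dissertation's joint count and duration model. It assumes a much simpler setting chosen only for illustration: a single subject with Poisson counts, a scalar random intercept b ~ N(0, sigma2), and a Gaussian approximate posterior q(b) = N(m, s2). The function names, the fixed linear predictors eta, and the brute-force integration used for comparison are all hypothetical.

```python
import math


def gva_lower_bound(y, eta, m, s2, sigma2):
    """Closed-form Gaussian variational lower bound for one subject.

    Assumed illustrative model (not the dissertation's):
        y_j | b ~ Poisson(exp(eta_j + b)),  b ~ N(0, sigma2)
    Assumed variational family: q(b) = N(m, s2).
    """
    # E_q[log p(y | b)]: the Gaussian moment generating function gives
    # E_q[exp(eta_j + b)] = exp(eta_j + m + s2 / 2) in closed form.
    expected_loglik = sum(
        yj * (ej + m) - math.exp(ej + m + 0.5 * s2) - math.lgamma(yj + 1)
        for yj, ej in zip(y, eta)
    )
    # E_q[log p(b)] under the N(0, sigma2) random effects distribution.
    expected_log_prior = (
        -0.5 * math.log(2 * math.pi * sigma2) - (m * m + s2) / (2 * sigma2)
    )
    # Entropy of q(b) = N(m, s2).
    entropy = 0.5 * math.log(2 * math.pi * math.e * s2)
    return expected_loglik + expected_log_prior + entropy


def log_marginal(y, eta, sigma2, n_grid=4001, width=10.0):
    """Log marginal likelihood by the trapezoid rule over b.

    Feasible here only because b is scalar; this is the integral that
    the variational bound is meant to avoid in high dimensions.
    """
    sd = math.sqrt(sigma2)
    lo, hi = -width * sd, width * sd
    h = (hi - lo) / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        b = lo + k * h
        log_joint = sum(
            yj * (ej + b) - math.exp(ej + b) - math.lgamma(yj + 1)
            for yj, ej in zip(y, eta)
        )
        log_joint += -0.5 * math.log(2 * math.pi * sigma2) - b * b / (2 * sigma2)
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * math.exp(log_joint)
    return math.log(total * h)


# Hypothetical data: three counts with fixed linear predictors.
y = [2, 0, 3]
eta = [0.1, -0.3, 0.5]
sigma2 = 0.5
bound = gva_lower_bound(y, eta, m=0.2, s2=0.2, sigma2=sigma2)
exact = log_marginal(y, eta, sigma2)
print(bound, exact)  # the bound never exceeds the exact log marginal likelihood
```

In this toy setting the gap between the two printed values equals the Kullback-Leibler divergence from q to the true posterior of b, so maximizing the bound over m and s2 tightens the approximation.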
Random Effects Models; Unobserved Heterogeneity; Count Data; Duration Data; Longitudinal Data Analysis.
Strawderman, Robert Lee; Booth, James
Ph.D. of Statistics
Doctor of Philosophy
dissertation or thesis