LATENT GAUSSIAN COPULA MODEL FOR HIGH DIMENSIONAL MIXED DATA, AND ITS APPLICATIONS

Abstract
With the advent of “big data” technologies, mixed data consisting of both categorical and continuous variables are encountered in many application areas. We present a framework for estimating the correlations among variables of mixed data types via a rank-based approach under a latent Gaussian copula model, and we establish theoretical properties of the resulting correlation matrix estimator. With the correlation matrix estimate Σ in hand, we extend the framework to other problems, such as graphical models, regression, and classification. In particular, we propose a family of methods for prediction with high dimensional mixed data that involve a shrunken estimate of the inverse of Σ. By maximizing the log likelihood of the data subject to a penalty on the elements of the inverse of Σ, we demonstrate that higher prediction accuracy can be achieved than with other popular existing methods. We also show that several existing methods are special cases of this family. In addition, we consider the classification problem via a covariance-based approach analogous to linear discriminant analysis.
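As a rough illustration of the rank-based idea, the sketch below covers only the simplest case, a pair of continuous variables, where the latent correlation under a Gaussian copula can be recovered from Kendall's tau through the standard relation sin(π/2 · τ). It is not the estimator developed in the dissertation: pairs involving categorical variables need different bridge functions, which are not shown here. The function name `rank_based_correlation` and the simulated data are assumptions made for this example.

```python
import numpy as np
from scipy.stats import kendalltau


def rank_based_correlation(X):
    """Latent correlation estimate for continuous columns only.

    Maps each pairwise Kendall's tau to sin(pi/2 * tau), the standard
    relation for continuous margins under a Gaussian copula.
    (Hypothetical helper; mixed pairs are not handled.)
    """
    n_vars = X.shape[1]
    R = np.eye(n_vars)
    for j in range(n_vars):
        for k in range(j + 1, n_vars):
            tau, _ = kendalltau(X[:, j], X[:, k])
            R[j, k] = R[k, j] = np.sin(np.pi / 2 * tau)
    return R


# Simulated latent Gaussian data with a known correlation of 0.6.
rng = np.random.default_rng(0)
true_sigma = np.array([[1.0, 0.6], [0.6, 1.0]])
Z = rng.multivariate_normal(np.zeros(2), true_sigma, size=2000)

# Observed variables are strictly increasing transforms of the latent ones,
# so their ranks (and hence Kendall's tau) are unchanged.
X = np.column_stack([np.exp(Z[:, 0]), Z[:, 1] ** 3])

sigma_hat = rank_based_correlation(X)
print(sigma_hat)
```

Because the estimate depends only on ranks, it is the same whether it is computed from the latent `Z` or from the transformed `X`, which is the property that makes the approach usable when the marginal transformations are unknown.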
Description
123 pages
Date Issued
2020-05
Committee Chair
Booth, James
Committee Member
Wells, Martin
Ning, Yang
Degree Discipline
Statistics
Degree Name
Ph. D., Statistics
Degree Level
Doctor of Philosophy
Types
dissertation or thesis