LATENT GAUSSIAN COPULA MODEL FOR HIGH DIMENSIONAL MIXED DATA, AND ITS APPLICATIONS
Abstract
With the advent of “big data” technologies, mixed data consisting of both categorical and continuous variables arise in many application areas. We present a framework for estimating the correlation among variables of mixed data types via a rank-based approach under a latent Gaussian copula model, and we establish theoretical properties of the resulting correlation matrix estimator. Using the correlation matrix estimate Σ, we extend the approach to other problems, such as graphical models, regression, and classification. In particular, we propose a family of methods for prediction with high dimensional mixed data that involves a shrunken estimate of the inverse of Σ. By maximizing the log likelihood of the data subject to a penalty on the elements of the inverse of Σ, we demonstrate that higher prediction accuracy can be achieved than with other popular existing methods. We also show that several existing methods are special cases of this family. In addition, we consider the classification problem via a covariance-based approach analogous to linear discriminant analysis.
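To make the rank-based idea concrete, the following is a minimal sketch for the simplest case only: two continuous variables. It assumes the standard Gaussian copula relation between Kendall's tau and the latent correlation, Σ_jk = sin(π/2 · τ_jk); the abstract does not state which rank statistic or transform the thesis uses, and pairs involving categorical variables need a different transform that is not reproduced here. The function names (`kendall_tau`, `rank_based_correlation`, `simulate`) are hypothetical and chosen for illustration.

```python
import math
import random


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return 2.0 * score / (n * (n - 1))


def rank_based_correlation(columns):
    """Estimate the latent correlation matrix from continuous columns.

    Assumption: continuous-continuous pairs only, using
    sigma_jk = sin(pi/2 * tau_jk), which holds under a Gaussian copula.
    """
    d = len(columns)
    sigma = [[1.0] * d for _ in range(d)]
    for j in range(d):
        for k in range(j + 1, d):
            tau = kendall_tau(columns[j], columns[k])
            sigma[j][k] = sigma[k][j] = math.sin(math.pi / 2 * tau)
    return sigma


def simulate(n, rho, seed=0):
    """Draw latent bivariate normals, then apply monotone transforms."""
    rng = random.Random(seed)
    x1, x2 = [], []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, 1.0)
        z1 = a
        z2 = rho * a + math.sqrt(1.0 - rho ** 2) * b
        # Monotone maps distort the margins but leave the ranks unchanged.
        x1.append(math.exp(z1))
        x2.append(z2 ** 3)
    return x1, x2


x1, x2 = simulate(n=400, rho=0.6)
sigma_hat = rank_based_correlation([x1, x2])
print(sigma_hat[0][1])  # should be close to the latent correlation 0.6
```

Because the estimate depends on the data only through ranks, it is unaffected by the unknown monotone transformations of the margins, which is why the observed variables need not be Gaussian themselves.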
Committee Member
Ning, Yang