Cornell University
Library

eCommons

Gaussian copula for mixed data with missing values: model estimation and imputation

File(s)
Zhao_cornellgrad_0058F_13067.pdf (1.86 MB)
Permanent Link(s)
https://doi.org/10.7298/611b-8239
https://hdl.handle.net/1813/111829
Collections
Cornell Theses and Dissertations
Author
Zhao, Yuxuan
Abstract

Missing data imputation forms the first critical step of many data analysis pipelines. For practical applications, imputation algorithms should produce imputations that match the true data distribution and handle data of mixed types. This dissertation develops new imputation algorithms for data with many different variable types, including continuous, binary, ordinal, truncated, and categorical values, by modeling data as samples from a Gaussian copula model. This semiparametric model learns the marginal distribution of each variable to match the empirical distribution, yet describes the interactions between variables with a joint Gaussian that enables fast inference, imputation with confidence intervals, and multiple imputation. This dissertation also develops specialized extensions to handle large datasets (with complexity linear in the number of observations) and streaming datasets (with online imputation).
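
As a rough illustration of the Gaussian copula idea described in the abstract, the following Python sketch imputes one continuous column from another. It is not the dissertation's algorithm: the function names are hypothetical, the latent correlation is estimated from complete rows only as a simplifying assumption, and mixed variable types, confidence intervals, and multiple imputation are not covered. Each column is mapped to latent normal scores through its empirical distribution, the missing score is predicted from the Gaussian conditional mean, and the prediction is mapped back through an empirical quantile.

```python
import math
from bisect import bisect_right
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def to_normal_score(observed, value):
    """Map `value` to a latent normal score via the empirical CDF of `observed`."""
    ordered = sorted(observed)
    n = len(ordered)
    rank = bisect_right(ordered, value)  # number of observed entries <= value
    rank = min(max(rank, 1), n)          # keep rank / (n + 1) strictly inside (0, 1)
    return STD_NORMAL.inv_cdf(rank / (n + 1))


def from_normal_score(observed, z):
    """Map a latent normal score back to the data scale via an empirical quantile."""
    ordered = sorted(observed)
    n = len(ordered)
    index = min(int(STD_NORMAL.cdf(z) * n), n - 1)
    return ordered[index]


def impute_second_column(xs, ys):
    """Fill None entries of `ys` using the fully observed column `xs` (hypothetical helper)."""
    y_observed = [y for y in ys if y is not None]
    complete = [(x, y) for x, y in zip(xs, ys) if y is not None]
    zx = [to_normal_score(xs, x) for x, _ in complete]
    zy = [to_normal_score(y_observed, y) for _, y in complete]
    # Assumption: latent correlation from complete rows only, computed without
    # centering because the latent scores are treated as zero-mean.
    rho = sum(a * b for a, b in zip(zx, zy)) / math.sqrt(
        sum(a * a for a in zx) * sum(b * b for b in zy)
    )
    filled = []
    for x, y in zip(xs, ys):
        if y is None:
            # Conditional mean of the latent y-score given the latent x-score.
            z_hat = rho * to_normal_score(xs, x)
            y = from_normal_score(y_observed, z_hat)
        filled.append(y)
    return filled, rho


xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [10, 20, None, 40, 50, None, 70, 80]
filled, rho = impute_second_column(xs, ys)
print(filled, round(rho, 3))
```

Because the back-transform uses an empirical quantile, every imputed value in this sketch is one of the observed values of that column, which is one simple way to keep imputations on the scale of the observed data.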

Description
188 pages
Date Issued
2022-05
Keywords
Gaussian copula • imputation • missing data • mixed data • ordinal data
Committee Chair
Udell, Madeleine Richards
Committee Member
Joachims, Thorsten
Ning, Yang
Degree Discipline
Statistics
Degree Name
Ph. D., Statistics
Degree Level
Doctor of Philosophy
Rights
Attribution-NonCommercial-ShareAlike 4.0 International
Rights URI
https://creativecommons.org/licenses/by-nc-sa/4.0/
Type
dissertation or thesis
Link(s) to Catalog Record
https://newcatalog.library.cornell.edu/catalog/15530016
