Information Recovery With Missing Data When Outcomes Are Right Censored.

Steingrimsson, Jon

dc.contributor.author	Steingrimsson, Jon
dc.contributor.chair	Hooker, Giles J.
dc.contributor.coChair	Strawderman, Robert Lee
dc.contributor.committeeMember	Wells, Martin Timothy
dc.contributor.committeeMember	Ruppert, David
dc.date.accessioned	2015-10-15T18:11:13Z
dc.date.available	2020-08-17T06:00:34Z
dc.date.issued	2015-08-17
dc.description.abstract	This dissertation focuses on using information more efficiently in several settings where some observations are right-censored, drawing on the semiparametric efficiency theory developed in Robins et al. (1994). Chapter 2 focuses on estimating the regression parameter in the semiparametric accelerated failure time model when the data are collected using a case-cohort design. Previously proposed estimation methods use some form of Horvitz-Thompson estimator, which is known to be inefficient; the main aim of Chapter 2 is to improve the efficiency of estimation of the regression parameter for the accelerated failure time model in case-cohort studies. We derive the semiparametric information bound and propose a more practical class of augmented estimators motivated by the augmentation theory developed in Robins et al. (1994). We develop large-sample properties, identify the most efficient estimator within the class of augmented estimators, and give practical guidance on how to calculate the estimator. Regression trees are nonparametric methods that use reduction in loss to divide the covariate space through binary partitions, creating a prediction model that is easily interpreted and visualized. When some observations are censored, the full-data loss function is not a function of the observed data, and Molinaro et al. (2004) used inverse probability weighted estimators to extend the loss functions to right-censored outcomes. Motivated by semiparametric efficiency theory, Chapter 3 extends the approach in Molinaro et al. (2004) by using doubly robust loss functions that make better use of the information in censored observations and are more robust to the modeling choices that need to be made. Regression trees are known to suffer from instability, with minor changes in the data sometimes resulting in very different trees. Ensemble-based methods that average several trees have been shown to lead to prediction models that usually have smaller prediction error. One such ensemble-based method is random forests (Breiman, 2001), and in Chapter 4 we use the regression tree methodology developed in Chapter 3 as building blocks for random forests.
dc.identifier.other	bibid: 9333149
dc.identifier.uri	https://hdl.handle.net/1813/41100
dc.language.iso	en_US
dc.subject	Missing Data
dc.subject	Semiparametric Theory
dc.subject	Censored Data
dc.title	Information Recovery With Missing Data When Outcomes Are Right Censored.
dc.type	dissertation or thesis
thesis.degree.discipline	Statistics
thesis.degree.grantor	Cornell University
thesis.degree.level	Doctor of Philosophy
thesis.degree.name	Ph. D., Statistics

Files

Original bundle

Name: jas757.pdf
Size: 1.06 MB
Format: Adobe Portable Document Format

Collections

Cornell Theses and Dissertations