Information Recovery With Missing Data When Outcomes Are Right Censored.

Steingrimsson, Jon

Files

jas757.pdf (1.06 MB)

Permanent Link(s)

https://hdl.handle.net/1813/41100

Collections

Cornell Theses and Dissertations

Author(s)

Steingrimsson, Jon

Abstract

This dissertation focuses on using information more efficiently in several settings where some observations are right-censored, drawing on the semiparametric efficiency theory developed in Robins et al. (1994).

Chapter 2 focuses on estimating the regression parameter in the semiparametric accelerated failure time model when the data are collected using a case-cohort design. Previously proposed estimation methods use some form of Horvitz-Thompson estimator, which is known to be inefficient; the main aim of Chapter 2 is to improve the efficiency of estimation of the regression parameter in the accelerated failure time model for case-cohort studies. We derive the semiparametric information bound and propose a more practical class of augmented estimators motivated by the augmentation theory developed in Robins et al. (1994). We develop large sample properties, identify the most efficient estimator within the class of augmented estimators, and give practical guidance on how to calculate the estimator.

Regression trees are nonparametric methods that use reduction in loss to partition the covariate space through binary splits, creating a prediction model that is easily interpreted and visualized. When some observations are censored, the full data loss function is not a function of the observed data, and Molinaro et al. (2004) used inverse probability weighted estimators to extend the loss functions to right-censored outcomes. Motivated by semiparametric efficiency theory, Chapter 3 extends the approach in Molinaro et al. (2004) by using doubly robust loss functions that make better use of the information in censored observations and are more robust to the modeling choices that need to be made.

Regression trees are known to suffer from instability, with minor changes in the data sometimes resulting in very different trees. Ensemble-based methods that average several trees have been shown to lead to prediction models that usually have smaller prediction error. One such ensemble-based method is random forests (Breiman, 2001), and in Chapter 4 we use the regression tree methodology developed in Chapter 3 as building blocks for random forests.
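
As an illustration of the inverse probability weighting idea the abstract attributes to Molinaro et al. (2004), the following is a minimal sketch of an inverse-probability-of-censoring-weighted squared-error loss. It is not code from the dissertation, and it is not the doubly robust loss of Chapter 3. The function names (`km_censoring_survival`, `ipcw_squared_error`) are hypothetical, and the sketch assumes that censoring is independent of the covariates, so that the censoring distribution can be estimated with a Kaplan-Meier estimator, and that event and censoring times are not tied.

```python
import numpy as np


def km_censoring_survival(time, event):
    """Kaplan-Meier estimate of the censoring survival function G(t) = P(C > t).

    Censoring plays the role of the "event" here (event == 0 marks a censored
    observation). Returns a function that evaluates the left limit G(t-).
    Assumption: no ties between event times and censoring times.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    cens_times = np.unique(time[event == 0])  # sorted distinct censoring times
    factors = []
    for t in cens_times:
        at_risk = np.sum(time >= t)
        n_cens = np.sum((time == t) & (event == 0))
        factors.append(1.0 - n_cens / at_risk)
    surv = np.cumprod(factors)

    def g_minus(t):
        # Number of censoring times strictly before t.
        idx = np.searchsorted(cens_times, t, side="left")
        return 1.0 if idx == 0 else float(surv[idx - 1])

    return g_minus


def ipcw_squared_error(time, event, prediction):
    """Average squared error in which each uncensored observation is weighted
    by 1 / G(T-) and each censored observation gets weight zero."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    g_minus = km_censoring_survival(time, event)
    weights = np.array(
        [1.0 / g_minus(t) if d == 1 else 0.0 for t, d in zip(time, event)]
    )
    return float(np.sum(weights * (time - prediction) ** 2) / len(time))
```

With no censored observations every weight equals one and the loss reduces to the ordinary mean squared error. Censored observations contribute only through the estimated censoring distribution, which is the inefficiency that the doubly robust loss functions described in the abstract aim to reduce.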

Date Issued

2015-08-17

Keywords

Missing Data; Semiparametric Theory; Censored Data

Committee Chair

Hooker, Giles J.

Committee Co-Chair

Strawderman, Robert Lee

Committee Member

Wells, Martin Timothy
Ruppert, David

Degree Discipline

Statistics

Degree Name

Ph. D., Statistics

Degree Level

Doctor of Philosophy

Types

dissertation or thesis
