Date of Award

12-15-2014

Document Type

Open Access Dissertation

Department

Statistics

First Advisor

Lianming Wang

Abstract

Survival analysis is a long-standing and popular research area with numerous applications in fields such as social science, engineering, economics, industry, and public health. Interval-censored data are a special type of survival data in which the survival time of interest is never observed exactly but is known to fall within some observed interval. Such data arise commonly in real-life studies in which subjects are examined at periodic or irregular follow-up visits. In this dissertation, we develop efficient statistical approaches for regression analysis of bivariate interval-censored data, in which the two survival times of interest are correlated and both have an interval-censored data structure. Chapter 1 first describes the structure of interval-censored data in detail and presents four real-life data sets for illustration. A literature review is provided of the existing semiparametric regression models and methods for interval-censored data. The last section of this chapter provides important background used in later chapters of this dissertation, such as Kendall’s τ and the Dirichlet process mixture model. Chapter 2 proposes a novel and fast EM algorithm for regression analysis of bivariate current status data based on the Gamma-frailty proportional hazards (PH) model. Monotone splines are adopted to approximate the unknown conditional baseline cumulative hazard functions. A three-stage data augmentation is proposed that leads to a complete-data likelihood in a simple form, and an EM algorithm is derived from this complete-data likelihood. The resulting algorithm is easy to implement, robust to initialization, and converges quickly. In simulation studies, the proposed method performs very well in estimating the regression parameters, the baseline survival functions, and the statistical association between the two failure times.
The method is also robust to misspecification of the frailty distribution and is much faster than existing approaches in the literature. It is illustrated by a real-life application on the prevalence of antibodies to hepatitis B and HIV among Irish prisoners. In Chapter 3, we revisit bivariate current status data from a Bayesian perspective. Two Bayesian methods are proposed: one for the Gamma-frailty PH model and one for a frailty PH model with an unknown frailty distribution, which is modeled by a Dirichlet process Gamma mixture. Efficient Gibbs samplers are developed for both models. Simulation results suggest that both methods work well under correctly specified and misspecified frailty distributions. The method based on the Gamma-frailty PH model is preferred because of its simpler model structure and robust performance, and because it provides Kendall’s τ in closed form. Chapter 4 investigates Bayesian regression analysis of bivariate interval-censored data. First, an efficient method is proposed based on the Gamma-frailty PH model, and simulation studies show that it works well when the model is correctly specified. However, the method yields biased estimates when the two failure times are independent or weakly correlated. To handle both dependent and independent cases, a mixture of a gamma distribution and a point mass at one is proposed for the frailty distribution. An efficient Gibbs sampler is developed and is shown through simulation studies to perform well in both cases. A real-life data set from an AIDS clinical trial is analyzed for illustration.
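
As a rough illustration of the data structure and of the closed-form Kendall’s τ mentioned in the abstract, the following Python sketch simulates bivariate current status data under a shared gamma-frailty PH model. The abstract does not state the dissertation's parameterization, so everything specific here is an assumption made for illustration: a frailty with mean 1 and variance θ, unit baseline hazards, a single binary covariate, one exponential examination time per subject, and all function names. Under a shared gamma frailty with variance θ, Kendall’s τ between the two failure times (conditional on covariates) is θ / (θ + 2).

```python
import math
import random


def kendall_tau_gamma_frailty(frailty_variance):
    """Closed-form Kendall's tau for a shared gamma frailty (mean 1, variance theta)."""
    return frailty_variance / (frailty_variance + 2.0)


def simulate_bivariate_current_status(n, frailty_variance, beta, seed=0):
    """Simulate n subjects; all modelling choices below are illustrative assumptions."""
    rng = random.Random(seed)
    shape = 1.0 / frailty_variance
    records = []
    for _ in range(n):
        x = int(rng.random() < 0.5)  # hypothetical binary covariate
        # gammavariate(shape, scale): mean = 1, variance = frailty_variance
        frailty = rng.gammavariate(shape, frailty_variance)
        # Conditional hazard = frailty * exp(beta * x) * baseline hazard (assumed 1)
        rate = frailty * math.exp(beta * x)
        t1 = rng.expovariate(rate)  # latent failure time 1 (never observed)
        t2 = rng.expovariate(rate)  # latent failure time 2 (never observed)
        c = rng.expovariate(1.0)    # single examination time
        records.append({
            "x": x,
            "c": c,
            "d1": int(t1 <= c),  # observed: has event 1 occurred by time c?
            "d2": int(t2 <= c),  # observed: has event 2 occurred by time c?
            "t1": t1,            # kept only to check tau in simulation
            "t2": t2,
        })
    return records


def empirical_kendall_tau(pairs):
    """Plain O(n^2) Kendall's tau for continuous data (no tie correction)."""
    n = len(pairs)
    concordant = discordant = 0
    for i in range(n):
        a1, a2 = pairs[i]
        for j in range(i + 1, n):
            b1, b2 = pairs[j]
            s = (a1 - b1) * (a2 - b2)
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

In a real current status study only `x`, `c`, `d1`, and `d2` would be available; the latent times are retained here so that the empirical τ from simulated data (with `beta = 0`) can be compared against `kendall_tau_gamma_frailty`.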
