Xichen Mou

Date of Award

Summer 2019

Document Type

Open Access Dissertation



First Advisor

Joshua M. Tebbs

Second Advisor

Dewei Wang


In epidemiological applications, individual specimens (e.g., blood or urine) are often pooled to detect the presence of disease or to measure the concentration of a specific biomarker. Because pooling is cost-efficient, pooled data also arise in diverse areas such as genetics, animal ecology, and environmental science. With pooled data, individual observations are masked, so new statistical methods are needed to estimate characteristics such as disease prevalence or the underlying density function of a biomarker. We focus on three estimation problems for pooled data. Chapters 2 and 3 propose nonparametric estimators for the density function f(Y|X) of a biomarker’s concentration Y given a single covariate X. We consider two pooling strategies: random pooling in Chapter 2 and homogeneous pooling in Chapter 3. For both strategies, we derive asymptotic properties of the density estimators and evaluate their performance through numerical studies in a variety of settings. We further illustrate the proposed methods by applying them to a polyfluorochemical data set. In Chapter 4, we develop a method to estimate disease prevalence and diagnostic accuracy probabilities (sensitivity and specificity) simultaneously from two-stage hierarchical group testing data. Through theoretical calculation and simulation, we show that our approach is more efficient than existing methods that use only pooled-level responses.