Date of Award
2017
Document Type
Open Access Dissertation
Department
Statistics
Sub-Department
Norman J. Arnold School of Public Health
First Advisor
James W. Hardin
Second Advisor
Bo Cai
Abstract
Count data are commonly encountered in practice, especially in self-reported behavioral studies. One issue with self-reported count data is inaccuracy. In the first part of the dissertation, we address one specific type of inaccuracy in bivariate count data: heaping. Copula functions are used to formulate the bivariate distribution. Using copula functions to solve data inaccuracy problems is still a new area, which we explore in this dissertation.
We also discuss methods for variable selection when the explanatory variables are highly correlated. In particular, our method is based on sparse Bayesian infinite factor models (Bhattacharya and Dunson, 2011). Classic Bayesian variable selection priors are integrated into the factor analysis method. The proposed method can accommodate both binary and continuous variables.
In the last part of this dissertation, we extend the Bayesian factor models to the nonparametric setting. Because the normality assumption can sometimes be too strict for the data, or outliers may affect model performance, our proposed method relaxes the normality assumption while simultaneously grouping the correlated explanatory variables. Our proposed method is one of the first explorations of allowing nonparametric assumptions in a Bayesian factor analysis setting.
Rights
© 2017, Xinling Xu
Recommended Citation
Xu, X. (2017). Statistical Methods for Multivariate and Correlated Data. (Doctoral dissertation). Retrieved from https://scholarcommons.sc.edu/etd/4273