Date of Award

Spring 2021

Document Type

Open Access Dissertation

Department

Statistics

First Advisor

John Grego

Second Advisor

James Lynch

Abstract

Functional data analysis (FDA) experienced a burst of growth after Ramsay and Silverman published their textbook in 1997. FDA interests researchers because of the challenges it adds to well-established multivariate analysis. Instead of finite-dimensional random vectors, the objects of study are infinite-dimensional random functions such as curves, images, and brain scans. A vast literature has been dedicated to developing models for functional data, mostly based on basis function representations and kernel-based nonparametric methods. In this dissertation, we propose a Bayesian treatment of nonparametric functional data analysis by introducing a Gaussian process (GP) over the space of functions. GP priors have been studied for Bayesian nonparametric regression and have achieved popularity in the machine learning community. A GP provides a useful prior for the underlying mean function of the functional data. The structure of the stochastic process allows data from all observed time (or other domain) points to be combined in the final analysis. This is especially useful for sparse functional data, where observations from different subjects may come from different (sparse) time grids. We propose a unified Bayesian nonparametric framework based on a GP prior for both sparse and non-sparse settings. The resulting GP functional data model efficiently produces intuitive results for smoothing, prediction, regression, and classification, especially for sparse functional data, which are more challenging to handle with the usual approaches. We derive analytical expressions for the posterior distribution and the posterior predictive distribution. An efficient computational algorithm is also presented to speed up model estimation and prediction for regular functional data. A classification algorithm based on the Bayes classifier is developed for predicting the group or population to which the functional data belong.
We demonstrate the performance of the proposed model through simulation studies and applications to three datasets. We model spinal bone mineral density, temporal gene expression, and tree growth using the Gaussian process-based functional data model. For modeling tree growth, we extend the proposed model to include random effects. Supervised classification is performed on all three datasets using an approach based on the proposed model. The possibility of extending the model to function-on-scalar regression by including covariate information in the mean function is discussed as well.
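
The pooling idea in the abstract, in which sparse observations from many subjects on different time grids inform a single GP posterior for the mean function, can be illustrated with a minimal sketch. This is not the dissertation's model or algorithm. It assumes a zero-mean GP prior with a squared-exponential kernel, independent Gaussian noise, fixed hyperparameters, and simulated data; the function names `sq_exp_kernel` and `gp_posterior` and all numeric settings are hypothetical choices made only for illustration.

```python
import numpy as np


def sq_exp_kernel(s, t, variance=1.0, length_scale=0.2):
    """Squared-exponential covariance between time points s and t (assumed kernel)."""
    d = s[:, None] - t[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)


def gp_posterior(t_obs, y_obs, t_new, noise_var=0.05):
    """Posterior mean and covariance of a zero-mean GP at t_new.

    Standard GP regression formulas with i.i.d. Gaussian noise,
    computed through a Cholesky factor for numerical stability.
    """
    K = sq_exp_kernel(t_obs, t_obs) + noise_var * np.eye(len(t_obs))
    K_star = sq_exp_kernel(t_new, t_obs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mean = K_star @ alpha
    v = np.linalg.solve(L, K_star.T)
    cov = sq_exp_kernel(t_new, t_new) - v.T @ v
    return mean, cov


# Simulated sparse functional data (hypothetical): 30 subjects,
# each observed at only 3 subject-specific random time points.
rng = np.random.default_rng(0)
subjects = []
for _ in range(30):
    t = np.sort(rng.uniform(0.0, 1.0, size=3))
    y = np.sin(2.0 * np.pi * t) + rng.normal(0.0, 0.2, size=3)
    subjects.append((t, y))

# Pool every subject's observations, regardless of its time grid.
t_obs = np.concatenate([t for t, _ in subjects])
y_obs = np.concatenate([y for _, y in subjects])

# Posterior for the common mean function on a dense grid.
t_new = np.linspace(0.0, 1.0, 50)
mean, cov = gp_posterior(t_obs, y_obs, t_new)
post_sd = np.sqrt(np.clip(np.diag(cov), 0.0, None))
```

No single subject has enough points to estimate a curve, but the pooled 90 observations cover the domain, so the posterior mean and pointwise standard deviation `post_sd` are defined on the whole grid. In practice the kernel hyperparameters and noise variance would be estimated or given priors rather than fixed as here.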
