Date of Award

4-30-2025

Document Type

Open Access Dissertation

Department

Statistics

First Advisor

Ray Bai

Abstract

The field of deep learning (DL) has received considerable attention in recent years. Thanks to rapid growth in computational power, the ability to collect massive datasets, and improvements in software and algorithms, DL is now routinely applied to areas as diverse as computer vision, natural language processing, and bioinformatics. At the same time, DL is only starting to be explored in the context of classical statistical inference problems such as bootstrapping, quantile regression, and mixture modeling. In this dissertation, we develop new DL methodology for three classical statistical problems: 1) weighted M-estimation, 2) joint quantile regression, and 3) mixing density estimation.

In Chapter 2, we introduce a deep learning generative framework for efficient weighted M-estimation. To overcome the computational bottlenecks of data perturbation procedures such as the bootstrap and cross-validation, we propose the Generative Multi-purpose Sampler (GMS), which directly constructs a generator function that produces solutions of weighted M-estimators from a set of given weights and tuning parameters. The GMS is trained with a single optimization procedure, without repeatedly evaluating the minimizers of weighted losses, and thus significantly reduces computation time. We demonstrate that the GMS framework enables statistical procedures that would be infeasible in a conventional framework, such as iterated bootstrap procedures and cross-validation for penalized likelihood. To construct a computationally efficient generator function, we also propose a novel form of neural network, the weight-multiplicative multilayer perceptron, which achieves fast convergence.
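To make the generator idea concrete, the following PyTorch sketch trains a small network that maps multinomial bootstrap weight vectors directly to weighted least-squares solutions, so each new bootstrap draw costs only a forward pass. The architecture and hyperparameters are illustrative assumptions; the dissertation's weight-multiplicative multilayer perceptron is not reproduced here.

```python
# Minimal sketch of the GMS idea: train a generator G(w) that maps bootstrap
# weights w to the corresponding weighted least-squares solution, so bootstrap
# replicates need only forward passes. Architecture is a plain MLP for
# illustration, not the dissertation's weight-multiplicative network.
import torch
import torch.nn as nn

torch.manual_seed(0)
n, p = 200, 5
X = torch.randn(n, p)
beta_true = torch.randn(p)
y = X @ beta_true + 0.5 * torch.randn(n)

# Generator: weight vector w (length n) -> coefficient estimate (length p)
G = nn.Sequential(nn.Linear(n, 64), nn.ReLU(), nn.Linear(64, p))
opt = torch.optim.Adam(G.parameters(), lr=1e-3)
boot = torch.distributions.Multinomial(n, torch.full((n,), 1.0 / n))

for step in range(2000):
    w = boot.sample((32,))                      # batch of bootstrap weights
    beta_hat = G(w)                             # (32, p) candidate solutions
    resid = y.unsqueeze(0) - beta_hat @ X.T     # (32, n) residuals per draw
    loss = (w * resid ** 2).sum(dim=1).mean()   # average weighted M-loss
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, bootstrap estimates come from one batched forward pass,
# with no re-fitting per resample:
boot_betas = G(boot.sample((1000,)))            # 1000 approximate estimates
```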

In Chapter 3, we introduce a deep learning generative model for joint quantile estimation called Penalized Generative Quantile Regression (PGQR). Our approach simultaneously generates samples at many random quantile levels, allowing us to infer the conditional distribution of a response variable given a set of covariates. Our method employs a novel variability penalty to avoid the problem of vanishing variability, or memorization, in deep generative models. Further, we introduce a new family of partial monotonic neural networks (PMNN) to circumvent the problem of crossing quantile curves. A major benefit of PGQR is that it is fit with a single optimization, bypassing the need to retrain the model at each quantile level or to tune the penalty parameter with computationally expensive cross-validation.
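As a rough illustration of joint quantile estimation with randomly drawn quantile levels, the sketch below trains a single network Q(x, τ) under the check (pinball) loss with τ resampled at every step, so one fit covers all quantile levels. It is a simplified stand-in: PGQR's variability penalty and partial monotonic architecture are omitted, so nothing here prevents crossing quantile curves.

```python
# Joint quantile regression sketch: one network Q(x, tau) trained with the
# check (pinball) loss at quantile levels tau drawn fresh each step.
import torch
import torch.nn as nn

torch.manual_seed(0)
n, p = 500, 3
X = torch.randn(n, p)
# Heteroscedastic toy response so conditional quantiles differ in spread
y = (X[:, 0] + (1.0 + 0.5 * X[:, 1].abs()) * torch.randn(n)).unsqueeze(1)

Q = nn.Sequential(nn.Linear(p + 1, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(Q.parameters(), lr=1e-3)

def check_loss(u, tau):
    # Pinball loss: tau * u if u >= 0, else (tau - 1) * u
    return torch.maximum(tau * u, (tau - 1.0) * u).mean()

for step in range(3000):
    tau = torch.rand(n, 1)                     # fresh level per observation
    pred = Q(torch.cat([X, tau], dim=1))
    loss = check_loss(y - pred, tau)
    opt.zero_grad()
    loss.backward()
    opt.step()

# A full conditional quantile curve at a new point in one pass:
x0 = torch.zeros(1, p)
taus = torch.linspace(0.05, 0.95, 19).unsqueeze(1)
q_curve = Q(torch.cat([x0.repeat(19, 1), taus], dim=1))
```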

In Chapter 4, we propose a deep generative process for mixing density estimation in latent variable models called Generative Bootstrapping for Nonparametric Maximum Likelihood Estimation (GB-NPMLE). GB-NPMLE rapidly produces bootstrap estimates of the NPMLE for an unknown continuous mixing distribution. Traditional bootstrapping requires re-estimating the NPMLE on every resampled dataset and does not scale; GB-NPMLE instead requires only a single run of a novel two-stage optimization algorithm. Our procedure accurately estimates continuous mixing densities at little computational cost, even with a hundred thousand observations.
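To see why traditional bootstrapping is the bottleneck, the sketch below implements a standard grid-based EM solver for the Kiefer-Wolfowitz NPMLE of a Gaussian mixing density. Classical bootstrapping reruns this entire loop on every resample, which is exactly the repeated cost that GB-NPMLE's single two-stage fit is designed to avoid. The grid size and iteration count are illustrative assumptions.

```python
# Grid-based EM for the Kiefer-Wolfowitz NPMLE of a Gaussian mixing density.
# The bootstrap repeats this whole optimization per resample.
import numpy as np

rng = np.random.default_rng(0)
n = 2000
theta = rng.normal(0.0, 2.0, size=n)        # latent means from the mixing density
x = theta + rng.normal(size=n)              # observed data: x_i ~ N(theta_i, 1)

grid = np.linspace(x.min(), x.max(), 200)   # support points for the discrete NPMLE
w = np.full(grid.size, 1.0 / grid.size)     # initial mixing weights

# Gaussian likelihood of each observation at each support point (n x 200)
lik = np.exp(-0.5 * (x[:, None] - grid[None, :]) ** 2) / np.sqrt(2 * np.pi)

for _ in range(500):                        # EM fixed-point updates
    post = lik * w                          # unnormalized posterior over grid
    post /= post.sum(axis=1, keepdims=True)
    w = post.mean(axis=0)                   # M-step: average responsibilities

# w now approximates the NPMLE mixing weights over `grid`.
```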

In Chapter 5, we introduce neural-g, a new neural network-based estimator for mixing density estimation. Neural-g uses a softmax output layer to ensure that the estimated prior is a valid probability density. Under default hyperparameters, we show that neural-g is flexible enough to capture a wide range of unknown densities, including those with flat regions, heavy tails, and/or discontinuities. We provide theoretical justification for neural-g by establishing a new universal approximation theorem on the ability of neural networks with softmax output layers to learn arbitrary probability mass functions. To accelerate convergence of our numerical implementation, we use a weighted average gradient descent approach to update the network parameters. Finally, we extend neural-g to multivariate prior density estimation.
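The softmax construction can be sketched as follows: a network's softmax output assigns probability mass to a fixed grid, and the masses are trained by maximizing the marginal log-likelihood of the observations. The input encoding, architecture, and the dissertation's weighted average gradient updates are simplified assumptions in this sketch.

```python
# Minimal sketch of the neural-g construction: softmax output over a grid
# gives a valid probability mass function, fit by maximum marginal likelihood.
import torch
import torch.nn as nn

torch.manual_seed(0)
n, m = 2000, 100
theta = torch.distributions.Laplace(0.0, 1.0).sample((n,))
x = theta + torch.randn(n)                      # x_i ~ N(theta_i, 1)

grid = torch.linspace(x.min().item(), x.max().item(), m)
# Gaussian likelihood matrix: n x m
lik = torch.exp(-0.5 * (x[:, None] - grid[None, :]) ** 2) / (2 * torch.pi) ** 0.5

net = nn.Sequential(nn.Linear(m, 64), nn.ReLU(), nn.Linear(64, m))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
inp = grid.unsqueeze(0)                         # feed the grid itself as input

for step in range(2000):
    w = torch.softmax(net(inp), dim=1)          # valid pmf over the grid
    marginal = lik @ w.squeeze(0)               # mixture likelihood per x_i
    loss = -torch.log(marginal + 1e-12).mean()  # negative marginal log-lik
    opt.zero_grad()
    loss.backward()
    opt.step()

g_hat = torch.softmax(net(inp), dim=1).detach().squeeze(0)  # estimated masses
```

The softmax layer is what guarantees that `g_hat` is nonnegative and sums to one at every step of training, which is the validity property the chapter highlights.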

Rights

© 2025, Shijie Wang
