Author

Tuan Quoc Do

Date of Award

Spring 2020

Document Type

Open Access Dissertation

Department

Statistics

First Advisor

Karl Gregory

Second Advisor

Lianming Wang

Abstract

This document is composed of three main chapters. In the first chapter, we study the mixture of experts, a powerful machine learning model in which each expert handles a different region of the covariate space. Choosing an appropriate number of experts is crucial to avoid overfitting or underfitting. We add a group fused lasso (GFL) penalty to the model to shrink the coefficients of the experts and of the gating network toward one another. An algorithm to optimize the penalized problem is developed using block-wise coordinate descent on the dual problem. Numerical results on simulated and real-world datasets show that the penalized model outperforms the unpenalized one and performs on par with many well-known models.
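As an illustrative sketch only (not the dissertation's implementation), a mixture-of-experts prediction combines per-expert linear predictions with softmax gating weights; the array shapes and function names here are assumptions for the example:

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the last axis
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def moe_predict(X, gate_W, expert_B):
    """Mixture-of-experts prediction (illustrative sketch).

    X        : (n, p) covariates
    gate_W   : (p, K) gating-network coefficients
    expert_B : (p, K) per-expert regression coefficients
    """
    gates = softmax(X @ gate_W)         # (n, K) mixing weights per observation
    preds = X @ expert_B                # (n, K) each expert's prediction
    return (gates * preds).sum(axis=1)  # (n,) gate-weighted mixture prediction
```

A GFL penalty of the kind described above would then be applied to the columns of `gate_W` and `expert_B` during fitting, so that redundant experts' coefficients are pulled together.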

The second chapter studies the GFL in its own right and develops methods to solve it efficiently. In the GFL, the response and the coefficient attached to each observation are vectors rather than scalars, so many fast solvers for the ordinary fused lasso cannot be applied. We propose two algorithms to solve the GFL, namely Alternating Minimization and Dual Path. Speed trials show that our algorithms are competitive with existing methods.
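To make the vector-valued structure concrete, here is a minimal sketch of a GFL objective with an identity design: a least-squares fit plus an unsquared L2 penalty on successive coefficient differences, which encourages exact fusion of neighboring coefficient vectors. The specific form (identity design, single tuning parameter) is an assumption for illustration, not the chapter's exact formulation:

```python
import numpy as np

def gfl_objective(Y, B, lam):
    """Group fused lasso objective (illustrative sketch).

    Y   : (n, d) vector-valued responses, one d-vector per observation
    B   : (n, d) coefficients, one d-vector per observation
    lam : nonnegative penalty weight
    """
    fit = 0.5 * np.sum((Y - B) ** 2)
    diffs = np.diff(B, axis=0)  # (n-1, d) successive differences B[i] - B[i-1]
    # L2 norm (not squared) of each difference: zero differences are favored exactly
    penalty = lam * np.sum(np.linalg.norm(diffs, axis=1))
    return fit + penalty
```

Because the penalty couples the d coordinates of each difference through a single Euclidean norm, the coordinates fuse jointly, which is what separates the GFL from d independent fused lassos.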

The third chapter proposes a more flexible alternative to the Box-Cox transformation, a popular method for transforming the response variable so that it has an approximately normal distribution. The Box-Cox transformation is widely applied in regression, ANOVA, and machine learning for both complete and censored data, but because it is parametric it can be too restrictive. Our proposed method is nonparametric, more flexible, and can be fitted efficiently by our novel EM algorithms, which accommodate both complete and right-censored data.
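For reference, the standard Box-Cox family (the parametric transformation the chapter seeks to relax) is (y^λ − 1)/λ for λ ≠ 0 and log y for λ = 0, defined for positive responses:

```python
import numpy as np

def box_cox(y, lam):
    """Standard Box-Cox transformation of a positive response y.

    (y**lam - 1) / lam  for lam != 0
    log(y)              for lam == 0  (the limit as lam -> 0)
    """
    y = np.asarray(y, dtype=float)
    if lam == 0:
        return np.log(y)
    return (y ** lam - 1.0) / lam
```

The entire family is indexed by the single parameter λ, which is what makes it restrictive: a nonparametric transformation is not confined to this one-dimensional set of shapes.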
