Date of Award

2010

Document Type

Campus Access Dissertation

Department

Statistics

First Advisor

Edsel Pena

Abstract

Advances in data generating technology, such as microarray technology, allow for hundreds or thousands of pairs of hypotheses to be tested simultaneously. In order to avoid erroneously rejecting an unpalatable number of null hypotheses, it is necessary to make use of a multiple testing method. Many of these methods, such as the well-known Benjamini and Hochberg [1995] or Šidák [1967] procedures, make use of the P-values, or significance values, of the individual tests, and further assume that P-value statistics are independent and identically distributed according to a uniform distribution under the null hypotheses. However, if the parametric model for the random observables is misspecified, or the test statistics have discrete distributions, this uniformity condition will not be satisfied. In Chapter 2, a stochastic process framework is introduced that, with the aid of a uniform variate, allows P-value statistics to satisfy the uniformity condition even when test statistics have discrete distributions. This allows nonparametric tests to be used to generate P-value statistics satisfying the uniformity condition. The resulting multiple testing procedures are therefore endowed with robustness properties. Simulation studies suggest that nonparametric randomized test P-values allow multiple testing methods to perform better when the model for the observables is nonparametric or misspecified. In Chapter 3, a careful examination of a real microarray data set is provided which indicates that microarray data may not be normally distributed. Nonparametric randomized P-values and t-test P-values are more rigorously compared. It is argued that randomized Wilcoxon test P-values yield more biologically meaningful rejected null hypotheses when used in a multiple testing procedure. In Chapter 4, the stochastic process framework is extended to allow each decision process to borrow information across tests. These stochastic processes then induce well-defined P-value statistics which still satisfy the uniformity and independence conditions. It is shown analytically and through simulation that these P-value statistics, when used in multiple testing procedures, tend to allow for more rejected null hypotheses than competing P-value statistics, and still allow for the relevant error rate to be controlled.
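
The role of the uniform variate can be illustrated with a minimal sketch. It uses the textbook randomized P-value for an upper-tailed binomial test, P(T > t) + U · P(T = t), followed by a Benjamini and Hochberg step-up. This is an assumption made for illustration only and is not the dissertation's stochastic process framework; the function names, the binomial null, and the sample sizes are all hypothetical.

```python
import math
import random


def binom_pmf(k, n, p0):
    """P(T = k) for T ~ Binomial(n, p0)."""
    return math.comb(n, k) * p0**k * (1 - p0) ** (n - k)


def randomized_p_value(t, n, p0, u):
    """Randomized upper-tail P-value: P(T > t) + u * P(T = t).

    With u drawn from Uniform(0, 1) independently of T, this quantity is
    exactly uniform under the null even though T is discrete.
    """
    upper_tail = sum(binom_pmf(k, n, p0) for k in range(t + 1, n + 1))
    return upper_tail + u * binom_pmf(t, n, p0)


def benjamini_hochberg(p_values, alpha):
    """Indices rejected by the Benjamini and Hochberg step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose ordered P-value is under its threshold.
        if p_values[i] <= alpha * rank / m:
            k_max = rank
    return sorted(order[:k_max])


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the illustration is repeatable
    n, p0 = 10, 0.5  # hypothetical test: 10 trials, null success rate 0.5
    observed = [9, 5, 10, 4, 8]  # hypothetical success counts, one per test
    p_vals = [randomized_p_value(t, n, p0, rng.random()) for t in observed]
    print(benjamini_hochberg(p_vals, alpha=0.05))
```

Setting u = 1 recovers the conventional (conservative) P-value P(T ≥ t), so the randomized version is never larger than the conventional one.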

Rights

© 2010, Joshua Habiger
