Randomization test and correlation effects in high dimensional data

Date

2012-07-17

Publisher

Kansas State University

Abstract

High-dimensional data (HDD) are encountered in many fields and are characterized by a “large p, small n” paradigm, as arises in genomic, lipidomic, and proteomic studies. This report used a simulation study that employed basic block-diagonal covariance matrices to generate correlated HDD. Quantities of interest in such data include, among others, the number of ‘significant’ discoveries, which can be highly variable when the data are correlated. This project compared randomization tests with the usual t-tests for testing for significant effects across two treatment conditions. Of interest was whether the variance of the number of discoveries is better controlled by a randomization test than by a t-test. The results showed that the randomization tests produced results similar to those of the t-tests.
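The kind of simulation described in the abstract can be illustrated with a minimal Python sketch. It is not the report's actual code. The function names, the equicorrelated blocks, the absolute difference in means as the randomization statistic, the global null (no true effects), and every parameter value (`n`, `p`, `block_size`, `rho`, `alpha`, `n_perm`) are assumptions chosen for illustration only.

```python
import numpy as np
from scipy import stats


def block_diag_cov(p, block_size, rho):
    """Block-diagonal covariance: unit variances, correlation rho within a block.

    Equicorrelated blocks are an assumption; the report's exact blocks may differ.
    """
    cov = np.zeros((p, p))
    for start in range(0, p, block_size):
        end = min(start + block_size, p)
        cov[start:end, start:end] = rho
    np.fill_diagonal(cov, 1.0)
    return cov


def t_test_pvalues(x, y):
    """Two-sample t-test p-value for each of the p variables (columns)."""
    return stats.ttest_ind(x, y, axis=0).pvalue


def randomization_pvalues(x, y, n_perm, rng):
    """Randomization p-values using |difference in means| (assumed statistic).

    Group labels are reshuffled across whole samples (rows), so the
    correlation between variables is preserved in every reshuffle.
    """
    data = np.vstack([x, y])
    n_x = x.shape[0]
    observed = np.abs(x.mean(axis=0) - y.mean(axis=0))
    exceed = np.zeros(data.shape[1])
    for _ in range(n_perm):
        idx = rng.permutation(data.shape[0])
        perm_x, perm_y = data[idx[:n_x]], data[idx[n_x:]]
        exceed += np.abs(perm_x.mean(axis=0) - perm_y.mean(axis=0)) >= observed
    # Add-one correction keeps every p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)


def simulate(n_datasets=50, n=10, p=200, block_size=20, rho=0.8,
             alpha=0.05, n_perm=200, seed=0):
    """Count 'discoveries' per simulated data set under a global null.

    Returns two arrays of length n_datasets: counts from the t-test and
    counts from the randomization test. All defaults are illustrative.
    """
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(block_diag_cov(p, block_size, rho))
    counts_t, counts_r = [], []
    for _ in range(n_datasets):
        # Two treatment groups of n samples each, with no true effects.
        x = rng.standard_normal((n, p)) @ chol.T
        y = rng.standard_normal((n, p)) @ chol.T
        counts_t.append(int(np.sum(t_test_pvalues(x, y) < alpha)))
        counts_r.append(int(np.sum(randomization_pvalues(x, y, n_perm, rng) < alpha)))
    return np.array(counts_t), np.array(counts_r)


if __name__ == "__main__":
    counts_t, counts_r = simulate()
    print("t-test:        mean", counts_t.mean(), "variance", counts_t.var(ddof=1))
    print("randomization: mean", counts_r.mean(), "variance", counts_r.var(ddof=1))
```

Comparing the variances of the two count arrays mirrors the question posed in the abstract. Rerunning with `rho=0` gives an uncorrelated baseline against which the effect of correlation on that variance can be judged.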

Keywords

Randomization test, Correlation effect, High dimensional data

Graduation Month

August

Degree

Master of Science

Department

Department of Statistics

Major Professor

Gary Gadbury

Date

2012

Type

Report
