R[superscript]2 statistics with application to association mapping

Sun, Guannan

Files

GuannanSun2008.pdf (729.54 KB)

Date

2008-05-15T15:15:37Z

Authors

Sun, Guannan

Publisher

Kansas State University

Abstract

In fitting linear models, the R[superscript]2 statistic has been widely used as a measure of goodness-of-fit and predictive power. Unlike for fixed-effects linear models, there is at present no single universally accepted measure for assessing the goodness-of-fit and predictive power of a linear mixed model. In this report, we review seven different approaches proposed to define a measure analogous to the usual R[superscript]2 statistic for assessing mixed models. One of the seven statistics, Rc, has both conditional and marginal versions. Association mapping is an efficient way to link genotype data with phenotypic diversity. Applied to association mapping, the R[superscript]2 statistic can determine how well genetic polymorphisms, which are the explanatory variables in the mixed models, explain the phenotypic variation, which is the dependent variable. A linear mixed model method has recently been developed to control for spurious associations due to population structure and relative kinship among the individuals in an association mapping sample. Using this new method, we assess seven definitions of the R[superscript]2 statistic for the linear mixed model with data from two empirical association mapping samples: a sample of 277 diverse maize inbred lines and a global sample of 95 Arabidopsis thaliana accessions. The R[superscript]2[subscript]LR statistic, derived from the log-likelihood principle, satisfies all the criteria for an R[superscript]2 statistic and can be used to understand the overlap between population structure and relative kinship in controlling for sample relatedness. Our results indicate that the R[superscript]2[subscript]LR statistic is appropriate for comparing models with different fixed and random variables. Therefore, we recommend using the R[superscript]2[subscript]LR statistic for linear mixed models in association mapping.

Keywords

R[superscript]2 Statistic, goodness-of-fit, mixed effect model, association mapping

Graduation Month

August

Degree

Master of Science

Department

Department of Statistics

Major Professor

Shie-Shien Yang

Date

2008

Type

Report

URI

http://hdl.handle.net/2097/774

Collections

K-State Electronic Theses, Dissertations, and Reports: 2004 -
