R[superscript]2 statistics with application to association mapping

K-REx Repository

dc.contributor.author Sun, Guannan
dc.date.accessioned 2008-05-15T15:15:37Z
dc.date.available 2008-05-15T15:15:37Z
dc.date.issued 2008-05-15T15:15:37Z
dc.identifier.uri http://hdl.handle.net/2097/774
dc.description.abstract In fitting linear models, the R[superscript]2 statistic has been widely used as one of the measures to assess the goodness-of-fit and predictive power of the model. Unlike fixed linear models, linear mixed models currently have no single universally accepted measure of goodness-of-fit and predictive power. In this report, we review seven different approaches proposed to define a measure analogous to the usual R[superscript]2 statistic for assessing mixed models. One of the seven statistics, Rc, has both conditional and marginal versions. Association mapping is an efficient way to link genotype data with phenotypic diversity. Applied to association mapping, the R[superscript]2 statistic can determine how well genetic polymorphisms, which are the explanatory variables in the mixed models, explain the phenotypic variation, which is the dependent variable. A linear mixed model method has recently been developed to control spurious associations due to population structure and relative kinship among individuals in an association mapping sample. Using this new method, we assess seven definitions of the R[superscript]2 statistic for the linear mixed model with data from two empirical association mapping samples: a sample of 277 diverse maize inbred lines and a global sample of 95 Arabidopsis thaliana accessions. The R[superscript]2[subscript]LR statistic, derived from the log-likelihood principle, satisfies all the criteria for an R[superscript]2 statistic and can be used to understand the overlap between population structure and relative kinship in controlling for sample relatedness. Our results show that the R[superscript]2[subscript]LR statistic is an appropriate R[superscript]2 statistic for comparing models with different fixed and random variables. Therefore, we recommend using the R[superscript]2[subscript]LR statistic for linear mixed models in association mapping. en
dc.description.sponsorship National Research Initiative Plant Genome Program of the USDA-CSREES en
dc.language.iso en_US en
dc.publisher Kansas State University en
dc.subject R[superscript]2 Statistic en
dc.subject goodness-of-fit en
dc.subject mixed effect model en
dc.subject association mapping en
dc.title R[superscript]2 statistics with application to association mapping en
dc.type Report en
dc.description.degree Master of Science en
dc.description.level Masters en
dc.description.department Department of Statistics en
dc.description.advisor Shie-Shien Yang en
dc.subject.umi Agriculture, Agronomy (0285) en
dc.subject.umi Statistics (0463) en
dc.date.published 2008 en
dc.date.graduationmonth August en

