On goodness-of-fit of logistic regression model





Kansas State University


The logistic regression model is a member of the family of generalized linear models and is widely used in many areas of scientific research. The logit link function and the binary dependent variable distinguish it from the linear regression model. Conclusions drawn from a fitted logistic regression model can be incorrect or misleading when the covariates cannot explain and/or predict the response variable accurately under the fitted model, that is, when lack-of-fit is present. Current goodness-of-fit tests can be roughly categorized into four types. (1) Tests based on covariate patterns, e.g., Pearson's chi-square test, the deviance D test, and Osius and Rojek's normal approximation test. (2) Tests based on the estimated probabilities, namely Hosmer-Lemeshow's C and H tests. (3) Score tests based on the comparison of two models, where the assumed logistic regression model is embedded into a more general parametric family of models, e.g., Stukel's score test and Tsiatis's test. (4) Smoothed residual tests, which include le Cessie and van Houwelingen's test and Hosmer and Lemeshow's test. Each type has advantages and disadvantages. In this dissertation, we propose a partition logistic regression model, which can be viewed as a generalized logistic regression model since it includes the logistic regression model as a special case. This partition model is used to construct a goodness-of-fit test for a logistic regression model that can also identify whether the lack-of-fit is due to the tail or the middle part of the probabilities of success. Simulation results showed that the proposed test performs as well as or better than many of the known tests.
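
As an illustration of the second category only (tests based on the estimated probabilities), the following is a minimal sketch of the usual Hosmer-Lemeshow C statistic. It is not the partition test proposed in the dissertation. The function names `hosmer_lemeshow_c` and `chi2_sf_even_df` are hypothetical, the fitted probabilities are assumed to come from an already fitted logistic regression model, and the grouping into equal-sized bins of sorted probabilities is one common convention among several. The sketch also assumes no group has an average fitted probability of exactly 0 or 1.

```python
import math


def hosmer_lemeshow_c(y, p_hat, groups=10):
    """Return (statistic, degrees_of_freedom) for the Hosmer-Lemeshow C test.

    y      : observed binary responses (0 or 1)
    p_hat  : fitted success probabilities from a logistic regression model
    groups : number of equal-sized bins of sorted fitted probabilities
    """
    if len(y) != len(p_hat):
        raise ValueError("y and p_hat must have the same length")

    # Sort observations by fitted probability so the bins follow the risk order.
    pairs = sorted(zip(p_hat, y))
    n = len(pairs)

    stat = 0.0
    for k in range(groups):
        chunk = pairs[k * n // groups:(k + 1) * n // groups]
        if not chunk:
            continue  # fewer observations than groups
        n_k = len(chunk)
        observed = sum(yi for _, yi in chunk)   # observed successes in bin k
        expected = sum(pi for pi, _ in chunk)   # expected successes in bin k
        p_bar = expected / n_k                  # average fitted probability
        stat += (observed - expected) ** 2 / (n_k * p_bar * (1.0 - p_bar))

    # The statistic is conventionally referred to chi-square with groups - 2 df.
    return stat, groups - 2


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total
```

With the default of 10 groups the reference distribution has 8 degrees of freedom, so `chi2_sf_even_df(stat, 8)` gives an approximate p-value; a small p-value suggests lack-of-fit.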



Logistic Regression, Goodness-of-Fit




Doctor of Philosophy


Department of Statistics

Major Professor

Shie-Shien Yang