A robust test of homogeneity in zero-inflated models for count data

Date

2018-05-01

Journal Title

Journal ISSN

Volume Title

Publisher

Abstract

Evaluating heterogeneity in the class of zero-inflated models has attracted considerable attention in the literature, where the heterogeneity refers to the instances of zero counts generated from two different sources. The mixture probability or the so-called mixing weight in the zero-inflated model is used to measure the extent of such heterogeneity in the population. Typically, the homogeneity tests are employed to examine the mixing weight at zero. Various testing procedures for homogeneity in zero-inflated models, such as score test and Wald test, have been well discussed and established in the literature. However, it is well known that these classical tests require the correct model specification in order to provide valid statistical inferences. In practice, the testing procedure could be performed under model misspecification, which could result in biased and invalid inferences. There are two common misspecifications in zero-inflated models, which are the incorrect specification of the baseline distribution and the misspecified mean function of the baseline distribution. As an empirical evidence, intensive simulation studies revealed that the empirical sizes of the homogeneity tests for zero-inflated models might behave extremely liberal and unstable under these misspecifications for both cross-sectional and correlated count data. We propose a robust score statistic to evaluate heterogeneity in cross-sectional zero-inflated data. Technically, the test is developed based on the Poisson-Gamma mixture model which provides a more general framework to incorporate various baseline distributions without specifying their associated mean function. The testing procedure is further extended to correlated count data. We develop a robust Wald test statistic for correlated count data with the use of working independence model assumption coupled with a sandwich estimator to adjust for any misspecification of the covariance structure in the data. The empirical performances of the proposed robust score test and Wald test are evaluated in simulation studies. It is worth to mention that the proposed Wald test can be implemented easily with minimal programming efforts in a routine statistical software such as SAS. Dental caries data from the Detroit Dental Health Project (DDHP) and Girl Scout data from Scouting Nutrition and Activity Program (SNAP) are used to illustrate the proposed methodologies.

Description

Keywords

Zero-inflated models, Homogeneity tests, Model misspecification, Huber sandwich estimator

Graduation Month

May

Degree

Doctor of Philosophy

Department

Department of Statistics

Major Professor

Wei-Wen Hsu

Date

2018

Type

Dissertation

Citation