Design-based efficiency for analyzing cluster-randomized experiments

Xiong, Yeng

Files

anaylzeCREfunctions.R (34.09 KB)

YengXiong2020.pdf (761 KB)

Date

2020-08-01

Authors

Xiong, Yeng

Publisher

Kansas State University

Abstract

Cluster-randomized experiments (CREs) have three defining features: (i) treatments are randomized to clusters, or groups of units, rather than to the units themselves; (ii) clusters are formed prior to experimentation and without researcher intervention; and (iii) the research objective and analysis are still centered on units. CREs are common, particularly for intervention studies in public health and political studies. Yet, despite their growing popularity, there is ongoing debate, even among experts, about how to analyze and design them. We focus on design-based estimators of the population average treatment effect (PATE) and the standard error (SE) under the Neyman-Rubin potential outcomes framework.

The inherent disparity between the experimental and observational units in CREs can lead to analytical and design challenges such as bias, large variability, or lack of location invariance. Moreover, randomizing treatments to clusters is known to be less efficient than randomizing to individual units. Conventionally, clusters in CREs are sampled using simple random sampling. Stratifying clusters, or matching them into pairs, based on important covariates can improve the precision of estimation.

We instead propose a different sampling scheme: sampling with probability proportional to size without replacement. This modification leads to a new estimator of the PATE that can accommodate the clustering structure in CREs without having to compromise on desirable statistical properties. We then derive a conservative estimator for the variance of our estimator. We also synthesize the myriad perspectives on designing CREs and produce recommendations on the best design practices. Finally, we introduce our R package analyzeCRE, which implements the theoretical work in this dissertation, and provide a guide to using its functions for analyzing and designing CREs.
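
The abstract does not say which probability-proportional-to-size (PPS) without-replacement scheme the dissertation uses. Purely as an illustration of the general idea, the sketch below draws clusters by systematic PPS sampling, one common scheme in which a cluster of size s is included with probability n*s/total. It is written in Python rather than R, the function name `systematic_pps_sample` is hypothetical, and nothing here is taken from the analyzeCRE package.

```python
import random


def systematic_pps_sample(sizes, n, rng=None):
    """Draw n distinct cluster indices by systematic PPS sampling.

    Illustrative sketch only (an assumption, not the dissertation's method).
    Cluster i is selected with probability n * sizes[i] / sum(sizes),
    which requires every such probability to be at most 1.
    """
    rng = rng or random.Random()
    total = sum(sizes)
    if not 0 < n <= len(sizes):
        raise ValueError("n must be between 1 and the number of clusters")
    if any(s <= 0 for s in sizes):
        raise ValueError("cluster sizes must be positive")
    if any(n * s > total for s in sizes):
        raise ValueError("a cluster is too large: inclusion probability > 1")

    step = total / n                 # spacing between selection points
    start = rng.random() * step      # random start in [0, step)

    chosen = []
    cumulative = 0
    k = 0                            # index of the next selection point
    for idx, size in enumerate(sizes):
        cumulative += size
        # No cluster is longer than one step, so at most one selection
        # point can land inside a cluster's interval.
        if k < n and start + k * step <= cumulative:
            chosen.append(idx)
            k += 1
    return chosen


if __name__ == "__main__":
    cluster_sizes = [10, 20, 30, 40, 50, 50]   # made-up sizes for the demo
    print(systematic_pps_sample(cluster_sizes, 3, random.Random(0)))
```

Larger clusters are more likely to be drawn, and no cluster can be drawn twice. In systematic sampling the order of the list affects which pairs of clusters can appear together, so the list is often shuffled before sampling.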

Keywords

Cluster-randomized experiments, Probability-proportional-to-size sampling, Neyman-Rubin causal model, Potential outcomes, Population average treatment effect

Graduation Month

August

Degree

Doctor of Philosophy

Department

Department of Statistics

Major Professor

Michael J. Higgins

Type

Dissertation

URI

https://hdl.handle.net/2097/40819

Collections

K-State Electronic Theses, Dissertations, and Reports: 2004 -
