Design-based efficiency for analyzing cluster-randomized experiments

Date

2020-08-01

Abstract

Cluster-randomized experiments (CREs) have three defining features: (i) treatments are randomized to clusters, or groups of units, rather than to the units themselves; (ii) clusters are formed before experimentation and without researcher intervention; and (iii) the research objective and analysis are still centered on units. CREs are common, particularly in intervention studies in public health and political studies. Yet despite their growing popularity, debate continues, even among experts, over how they should be designed and analyzed. We focus on design-based estimators of the population average treatment effect (PATE) and of the associated standard error (SE) under the Neyman-Rubin potential outcomes framework.

The inherent disparity between the experimental and observational units in CREs can lead to analytical and design challenges such as bias, large variability, or lack of location invariance. Moreover, randomizing treatments to clusters is known to be less efficient than randomizing to individual units. Conventionally, clusters in CREs are sampled using simple random sampling. Stratifying clusters, or matching them into pairs, based on important covariates can improve the precision of estimation.

We instead propose a different sampling scheme: sampling with probability proportional to size without replacement. This modification leads to a new estimator of the PATE that accommodates the clustering structure of CREs without compromising desirable statistical properties. We then derive a conservative estimator of the variance of our estimator. We also synthesize the many perspectives on designing CREs and offer recommendations on best design practices. Finally, we introduce our R package analyzeCRE, which implements the theoretical work in this dissertation, and provide a guide to using its functions for analyzing and designing CREs.

Keywords

Cluster-randomized experiments, Probability-proportional-to-size sampling, Neyman-Rubin causal model, Potential outcomes, Population average treatment effect

Graduation Month

August

Degree

Doctor of Philosophy

Department

Department of Statistics

Major Professor

Michael J. Higgins

Type

Dissertation
