Design-based efficiency for analyzing cluster-randomized experiments

dc.contributor.author: Xiong, Yeng
dc.date.accessioned: 2020-08-13T14:52:26Z
dc.date.available: 2020-08-13T14:52:26Z
dc.date.graduationmonth: August
dc.date.issued: 2020-08-01
dc.description.abstract: Cluster-randomized experiments (CREs) have three defining features: (i) treatments are randomized to clusters, or groups of units, rather than to the units themselves; (ii) clusters are formed prior to experimentation and without researcher intervention; and (iii) the research objective and analysis are still centered on units. CREs are common, particularly for intervention studies in public health and political studies. Yet, despite their growing popularity, there is ongoing debate, even among experts, about how to design and analyze them. We focus on design-based estimators of the population average treatment effect (PATE) and the standard error (SE) under the Neyman-Rubin potential outcomes framework. The inherent disparity between the experimental and observational units in CREs can lead to analytical and design challenges, for example bias, large variability, and/or lack of location invariance. Moreover, randomizing treatments to clusters is known to be less efficient than randomizing to individual units. Conventionally, clusters in CREs are sampled using simple random sampling. Stratifying or matching clusters into pairs based on important covariates can improve the precision of estimation. We instead propose a different sampling scheme: sampling with probability proportional to size without replacement. This modification leads to a new estimator of the PATE that can accommodate the clustering structure in CREs without compromising on desirable statistical properties. We then derive a conservative estimator for the variance of our estimator. We also synthesize the many perspectives on designing CREs and produce recommendations on best design practices. Finally, we introduce our R package analyzeCRE, which implements the theoretical work in this dissertation, and provide a guide on how to use its functions for analyzing and designing CREs.
dc.description.advisor: Michael J. Higgins
dc.description.degree: Doctor of Philosophy
dc.description.department: Department of Statistics
dc.description.level: Doctoral
dc.identifier.uri: https://hdl.handle.net/2097/40819
dc.language.iso: en_US
dc.publisher: Kansas State University
dc.rights: © the author. This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Cluster-randomized experiments
dc.subject: Probability-proportional-to-size sampling
dc.subject: Neyman-Rubin causal model
dc.subject: Potential outcomes
dc.subject: Population average treatment effect
dc.title: Design-based efficiency for analyzing cluster-randomized experiments
dc.type: Dissertation

Files

Original bundle

Name: anaylzeCREfunctions.R
Size: 34.09 KB
Format: Unknown data format
Description: R functions

Name: YengXiong2020.pdf
Size: 761 KB
Format: Adobe Portable Document Format
Description: Dissertation

License bundle

Name: license.txt
Size: 1.62 KB
Format: Item-specific license agreed upon to submission
Description: