Mean-weighted case specific random forests for estimating causal effects

dc.contributor.author: Addae, Linus
dc.date.accessioned: 2021-08-10T18:40:40Z
dc.date.available: 2021-08-10T18:40:40Z
dc.date.graduationmonth: August (en_US)
dc.date.issued: 2021-08-01
dc.date.published: 2021 (en_US)
dc.description.abstract: Causal inference is a branch of statistics that deals with determining how responses are affected by treatments. In this dissertation, we examine two problems in causal inference under the Neyman-Rubin causal model (NRCM): estimation of counterfactuals (hypothetical unobserved responses of units under different treatment conditions) and treatment effect estimation under treatment spillover (when the treatment status of one unit affects the response of another). First, we extend the case specific random forest (CSRF) methodology to develop mean-weighted case specific random forests (MWCSRF) for estimating the average treatment effect for the treated (ATT). We consider a setting in which the data contain many control units and very few treated units, and the covariate space for the treated units is a small subspace of that for the control units. For example, treated units may be those that underwent an experimental procedure, and control units may be the set of units in a national database. Our approach is as follows. First, we compute bootstrap sample weights for each treated unit to oversample control units near that treated unit. Then, we average these weights to construct one set of "treated" sample weights. Next, we use random forests to estimate the prognostic score (the expected control outcome given a set of covariates) for each treated unit. Finally, we estimate the ATT by averaging, across all treated units, the difference between the observed response and the estimated prognostic score. We show via a simulation study that MWCSRF performs favorably compared to the standard random forest, causal forests, and genetic matching under both homogeneous and heterogeneous treatment effect settings, especially when the number of treated units is small. Additionally, we demonstrate that, when parallelization is not available, MWCSRF requires significantly less runtime than CSRF. We confirm our findings in a study of the efficacy of the National Supported Work Demonstration, and we develop an R package for MWCSRF.
Second, we discuss the problem of treatment spillover in the context of Fisher's Lady Tasting Tea experiment. We show that, by design, Lady Tasting Tea can violate the stable unit treatment value assumption (SUTVA), which requires the response of a unit to be affected only by the treatment status of that unit. We show that SUTVA may be violated under this model even when, for a given cup, the Lady's milk-first likelihood is always higher when that cup actually receives milk first. Moreover, we show that SUTVA holds under two conditions: one in which the Lady's likelihood for a cup is the same regardless of whether that cup was given milk first or tea first, and one in which the Lady always makes perfect guesses. These results further emphasize that SUTVA violations cannot be classified solely as treatment spillover problems; they can be inherent in the design of an experiment. Additionally, this result may have implications for teaching causal inference, as it may be preferable to introduce randomized experiments using examples that do not inherently violate SUTVA. (en_US)
dc.description.advisor: Michael J. Higgins (en_US)
dc.description.degree: Doctor of Philosophy (en_US)
dc.description.department: Department of Statistics (en_US)
dc.description.level: Doctoral (en_US)
dc.identifier.uri: https://hdl.handle.net/2097/41626
dc.language.iso: en_US (en_US)
dc.subject: Causal inference (en_US)
dc.subject: Random forest (en_US)
dc.subject: Stable unit treatment value assumption (en_US)
dc.subject: Mean-weighting (en_US)
dc.subject: Counterfactual (en_US)
dc.title: Mean-weighted case specific random forests for estimating causal effects (en_US)
dc.type: Dissertation (en_US)
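
The four estimation steps listed in the abstract (per-treated-unit sample weights, averaging them, bagged prognostic score estimation, and the mean treated difference) can be illustrated with a rough Python sketch. This is not the dissertation's implementation or its R package. Several pieces are assumptions made only for illustration: the Gaussian kernel used for the per-unit weights, the `bandwidth` parameter, the single covariate, the synthetic data, and all function names. A one-split regression stump is swapped in for the full trees of a random forest to keep the example dependency-free.

```python
import math
import random
from statistics import fmean


def kernel_weights(x_t, control_x, bandwidth):
    # ASSUMPTION: Gaussian kernel weights; the abstract does not specify
    # how the per-treated-unit bootstrap weights are computed.
    raw = [math.exp(-((x - x_t) ** 2) / (2 * bandwidth ** 2)) for x in control_x]
    total = sum(raw)
    return [r / total for r in raw]


def mean_weights(treated_x, control_x, bandwidth):
    # Average the per-treated-unit weights into one set of "treated" weights.
    per_unit = [kernel_weights(x_t, control_x, bandwidth) for x_t in treated_x]
    return [fmean(col) for col in zip(*per_unit)]


def fit_stump(xs, ys):
    # Stand-in base learner: a one-split regression stump chosen by
    # minimum squared error (a real random forest grows deeper trees).
    pairs = sorted(zip(xs, ys))
    overall = fmean(ys)
    best_sse, split, left_mean, right_mean = float("inf"), None, overall, overall
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical covariate values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        ml, mr = fmean(left), fmean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if sse < best_sse:
            best_sse, left_mean, right_mean = sse, ml, mr
            split = (pairs[i - 1][0] + pairs[i][0]) / 2

    def predict(x):
        if split is None:
            return overall
        return left_mean if x <= split else right_mean

    return predict


def estimate_att(treated_x, treated_y, control_x, control_y,
                 n_trees=30, bandwidth=0.1, seed=0):
    rng = random.Random(seed)
    weights = mean_weights(treated_x, control_x, bandwidth)
    idx = range(len(control_x))
    learners = []
    for _ in range(n_trees):
        # Weighted bootstrap: controls near the treated units are oversampled.
        draw = rng.choices(idx, weights=weights, k=len(control_x))
        learners.append(fit_stump([control_x[i] for i in draw],
                                  [control_y[i] for i in draw]))
    # Prognostic score = averaged prediction of the control outcome.
    diffs = [y - fmean(f(x) for f in learners)
             for x, y in zip(treated_x, treated_y)]
    return fmean(diffs)


# Synthetic data (hypothetical): the true effect on the treated is 1.0.
data_rng = random.Random(1)
control_x = [data_rng.random() for _ in range(150)]
control_y = [2 * x + data_rng.gauss(0, 0.1) for x in control_x]
treated_x = [0.45, 0.475, 0.5, 0.525, 0.55]
treated_y = [2 * x + 1.0 for x in treated_x]

att_hat = estimate_att(treated_x, treated_y, control_x, control_y)
print(round(att_hat, 3))
```

Because the weights are averaged once, a single ensemble serves every treated unit. That is one plausible reading of why the abstract reports lower runtime than CSRF without parallelization, though the sketch does not measure it.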

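The abstract's claim that Lady Tasting Tea can violate SUTVA by design can be made concrete with a small Python sketch. The response model here is an assumption for illustration and may differ from the one in the dissertation: each cup gets a hypothetical "milk-first" score that depends only on that cup's own treatment, and the Lady must label exactly four of the eight cups as milk first, so she picks the four highest scores. All numbers and names are invented.

```python
def lady_guesses(milk_first, milk_score, tea_score, n_milk=4):
    # ASSUMED response model: score each cup from its own treatment only,
    # then label the n_milk highest-scoring cups as milk first (1), others 0.
    scores = [milk_score[i] if milk_first[i] else tea_score[i]
              for i in range(len(milk_first))]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = set(ranked[:n_milk])
    return [1 if i in chosen else 0 for i in range(len(scores))]


# Hypothetical scores: for every cup, the milk-first score exceeds the
# tea-first score, matching the condition described in the abstract.
milk_score = [0.60, 0.90, 0.91, 0.92, 0.80, 0.81, 0.82, 0.83]
tea_score = [0.20, 0.70, 0.71, 0.72, 0.10, 0.11, 0.12, 0.13]

# Cup 0 receives milk first in both assignments; only the other cups change.
assignment_a = [1, 1, 1, 1, 0, 0, 0, 0]
assignment_b = [1, 0, 0, 0, 1, 1, 1, 0]

guess_a = lady_guesses(assignment_a, milk_score, tea_score)
guess_b = lady_guesses(assignment_b, milk_score, tea_score)

# Under SUTVA these two responses for cup 0 would have to be equal.
print(guess_a[0], guess_b[0])
```

In this sketch, the response recorded for cup 0 changes between the two assignments even though cup 0's own treatment does not. The constraint of labelling exactly four cups couples the responses across cups.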
Files

Original bundle
Name: LinusAddae2021.pdf
Size: 2.46 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.62 KB
Format: Item-specific license agreed upon to submission