Finding common support and assessing matching methods for causal inference

Date

2017-08-01

Publisher

Kansas State University

Abstract

This dissertation presents an approach to assess and validate causal inference tools used to estimate the causal effect of a treatment. Finding treatment effects in observational studies is complicated by the need to control for confounders. Common approaches for controlling for confounders include using prognostically important covariates to form groups of similar units containing both treatment and control units, or modeling responses through interpolation. This dissertation proposes a series of new, computationally efficient methods to improve the analysis of observational studies. Treatment effects can be reliably estimated only for a subpopulation in which a common support assumption holds, that is, one in which the treatment and control covariate spaces overlap. Given a distance metric measuring dissimilarity between units, graph theory is used to find common support. An adjacency graph is constructed in which edges are drawn between similar treated and control units, and regions of common support are determined by finding the largest connected components (LCC) of this graph. The results show that LCC improves on existing methods by efficiently constructing regions that preserve clustering in the data while ensuring interpretability of the region through the distance metric. This approach is extended to propose a new matching method called largest caliper matching (LCM). LCM is a version of cardinality matching, a type of matching used to maximize the number of units in an observational study under a covariate balance constraint between treatment groups. While traditional cardinality matching is NP-hard, LCM can be completed in polynomial time. The performance of LCM is compared with that of five other popular matching methods through a series of Monte Carlo simulations. Performance is measured by the bias, empirical standard deviation, and mean squared error of the estimates under different treatment prevalences and different distributions of covariates.
The resulting matched samples improve estimation of the population treatment effect in a wide range of settings, and the simulations suggest cases in which certain matching algorithms perform better than others. Finally, this dissertation presents an application of LCC and matching methods to a study of the effectiveness of right heart catheterization (RHC) and finds that clinical outcomes are significantly worse for patients who undergo RHC.
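The graph construction described in the abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function name `largest_connected_component`, the use of Euclidean distance, and the fixed `caliper` threshold that defines "similar" units are all assumptions made for illustration only.

```python
import math
from collections import deque


def largest_connected_component(treated, control, caliper):
    """Illustrative sketch (assumed names and threshold rule).

    Draws an edge between a treated and a control unit when their
    Euclidean distance is at most `caliper`, then returns the indices
    (treated_idx, control_idx) of the largest connected component.
    """
    n_t, n_c = len(treated), len(control)
    # Nodes 0..n_t-1 are treated units; n_t..n_t+n_c-1 are control units.
    adj = [[] for _ in range(n_t + n_c)]
    for i, x in enumerate(treated):
        for j, y in enumerate(control):
            if math.dist(x, y) <= caliper:
                adj[i].append(n_t + j)
                adj[n_t + j].append(i)

    seen = [False] * (n_t + n_c)
    best = []
    for start in range(n_t + n_c):
        if seen[start]:
            continue
        # Breadth-first search collects one connected component.
        seen[start] = True
        comp, queue = [], deque([start])
        while queue:
            u = queue.popleft()
            comp.append(u)
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
        if len(comp) > len(best):
            best = comp

    # A single isolated unit has no counterpart in the other group,
    # so it does not represent overlap between the groups.
    if len(best) < 2:
        return [], []
    t_idx = sorted(u for u in best if u < n_t)
    c_idx = sorted(u - n_t for u in best if u >= n_t)
    return t_idx, c_idx


# Hypothetical one-dimensional covariate values.
treated = [(0.0,), (0.2,), (5.0,)]
control = [(0.1,), (0.3,), (9.0,)]
print(largest_connected_component(treated, control, caliper=0.15))
```

In this toy example the treated unit at 5.0 and the control unit at 9.0 have no counterpart within the caliper, so they fall outside the returned region. Because edges join only treated and control units, any component with more than one unit contains both groups.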

Keywords

Causal inference

Graduation Month

August

Degree

Doctor of Philosophy

Department

Department of Statistics

Major Professor

Michael J. Higgins

Date

2017

Type

Dissertation
