Finding common support through largest connected components and its implementation

Date

2019-05-01

Abstract

In an observational study, the average treatment effect may only be reliably estimated for the subset of units for which the covariate spaces of the treated and control units overlap. This requirement is known as the common support assumption. In this report, we develop a method to find a region of common support. Given a distance function that measures dissimilarity between any two units with differing treatment statuses, we construct an adjacency list by drawing an edge between each pair of treated and control units whose distance is no larger than a pre-specified threshold. Then all connected components of the resulting graph are found. Finally, a region of common support is obtained by taking the largest connected components (LCC) of this graph (e.g., the connected components with the most treated units). We implement the LCC algorithm by using binary search trees to find all the connected components in the sample data and sorting them by size. This algorithm requires O(n²) runtime and O(n) memory, where n is the number of units in the observational study. We then create an R package implementing this LCC algorithm. Finally, we use our R package to compare the performance of LCC to that of other common support methods on simulated data.
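The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the report's R package and does not reproduce its binary-search-tree implementation: a union-find (disjoint-set) structure is swapped in as a simpler way to track components, and Euclidean distance is assumed as the dissimilarity measure. The function name `largest_connected_component`, its arguments, and the toy data are hypothetical. Edges are never stored; each treated-control pair is checked once, so the sketch uses O(n²) distance evaluations and O(n) memory.

```python
from math import dist  # Euclidean distance, Python 3.8+


def largest_connected_component(covariates, treated, threshold):
    """Return sorted indices of the component with the most treated units.

    Assumptions for illustration: Euclidean distance, ties broken by
    total component size, union-find instead of binary search trees.
    """
    n = len(covariates)
    if n == 0:
        return []
    parent = list(range(n))  # union-find forest, O(n) memory

    def find(i):
        # Walk to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    treated_idx = [i for i in range(n) if treated[i]]
    control_idx = [i for i in range(n) if not treated[i]]

    # An edge joins a treated and a control unit within the threshold.
    for t in treated_idx:
        for c in control_idx:
            if dist(covariates[t], covariates[c]) <= threshold:
                root_t, root_c = find(t), find(c)
                if root_t != root_c:
                    parent[root_t] = root_c

    # Group units by root; unmatched units form singleton components.
    components = {}
    for i in range(n):
        components.setdefault(find(i), []).append(i)

    # Rank components by number of treated units, then by total size.
    best = max(
        components.values(),
        key=lambda comp: (sum(1 for i in comp if treated[i]), len(comp)),
    )
    return sorted(best)


# Toy data: one covariate, six units, three of them treated.
units = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (10.0,)]
is_treated = [True, False, True, True, False, False]
support = largest_connected_component(units, is_treated, threshold=0.5)
print(support)  # [0, 1, 2]
```

In the toy data, units 0, 1, and 2 form one component with two treated units, units 3 and 4 form a second with one treated unit, and unit 10.0 is isolated, so the first component is returned as the region of common support.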

Keywords

Connected components, LCC, Observational study, Covariate, Propensity score, ATT, ATE, Treatment, Control, Treatment effect

Graduation Month

May

Degree

Master of Science

Department

Department of Statistics

Major Professor

Michael Higgins

Date

2019

Type

Report
