Data envelopment analysis with sparse data

Date

2011-11-15

Publisher

Kansas State University

Abstract

The quest for continuous improvement in organizations and the problem of missing data in data analysis are both never-ending. This thesis brings the two topics together by evaluating the productivity of organizations with sparse data. The study uses Data Envelopment Analysis (DEA) to determine the efficiency of 41 member clinics of the Kansas Association of Medically Underserved (KAMU) in the presence of missing data. The primary focus of this thesis is to develop new, reliable methods for determining the missing values and then to execute DEA. DEA is a linear programming methodology for evaluating the relative technical efficiency of homogeneous Decision Making Units using multiple inputs and outputs. The effectiveness of DEA depends on the quality and quantity of the data used. DEA outcomes are susceptible to missing data, which creates a need to supplement sparse data in a reliable manner. Determining missing values more precisely improves the robustness of the DEA methodology. Three methods for determining the missing values are proposed in this thesis, each based on a different platform. The first method, the Average Ratio Method (ARM), uses the average of all the ratios between two variables. The second method is based on a modified Fuzzy C-Means clustering algorithm that can handle missing data. The issues associated with this clustering algorithm are resolved to improve its effectiveness. The third method is based on an interval approach: missing values are replaced by interval ranges estimated by experts, and crisp efficiency scores are identified along the same lines as DEA determines efficiency scores using the best set of weights. There is no unique way to evaluate the effectiveness of these methods. Their effectiveness is tested by choosing a complete dataset and treating varying levels of data as missing. The best set of recovered missing values, based on the above methods, serves as the input for executing DEA.
Results show that the DEA efficiency scores generated with recovered values are in close proximity to the actual efficiency scores that would be generated with the complete data. In summary, this thesis provides an effective and practical approach for replacing missing values needed for DEA.
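
The abstract describes the Average Ratio Method only as using the average of all the ratios between two variables. The following is a minimal sketch of one plausible reading of that idea, not the procedure defined in the thesis: a missing value of one variable is estimated as the mean observed ratio to a second variable, multiplied by that unit's known value of the second variable. The function name `average_ratio_impute`, the field names `staff` and `visits`, and the clinic figures are all hypothetical and chosen purely for illustration.

```python
from statistics import mean


def average_ratio_impute(rows, target, reference):
    """Fill missing `target` values using the mean target/reference ratio.

    Assumption: this is an illustrative interpretation of an
    average-ratio approach, not the thesis's exact method.
    """
    # Collect ratios from units where both variables are observed
    # and the reference value is nonzero.
    ratios = [
        r[target] / r[reference]
        for r in rows
        if r[target] is not None and r[reference] not in (None, 0)
    ]
    if not ratios:
        raise ValueError("no complete pairs to estimate a ratio from")
    avg_ratio = mean(ratios)

    filled = []
    for r in rows:
        r = dict(r)  # copy so the input records are left untouched
        if r[target] is None and r[reference] is not None:
            r[target] = avg_ratio * r[reference]
        filled.append(r)
    return filled


# Hypothetical clinic records: staff (an input) and visits (an output).
clinics = [
    {"staff": 10, "visits": 200},
    {"staff": 20, "visits": 500},
    {"staff": 8, "visits": None},  # missing output value
]
result = average_ratio_impute(clinics, target="visits", reference="staff")
print(result[2])
```

In this sketch the recovered values would then be passed, together with the observed data, to whichever DEA model is being solved; the choice of which variable pair to use for the ratio is left to the analyst.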

Keywords

Data envelopment analysis, Sparse data, Missing values, Healthcare, Clustering, Fuzzy Set Theory

Graduation Month

December

Degree

Master of Science

Department

Department of Industrial & Manufacturing Systems Engineering

Major Professor

David H. Ben-Arieh

Date

2011

Type

Thesis
