Analysis of patient satisfaction survey data

K-REx Repository

Gaumer, Jason M. 2019-05-08T15:08:16Z 2019-05-08T15:08:16Z 2019-08-01
dc.description.abstract We analyzed a dataset provided by an anonymous hospital in the Midwest to identify characteristics that affect two response variables of interest: Topbox Overall score and Advocacy. Topbox Overall score indicates that a patient rated the hospital 9 or 10 on the overall patient satisfaction score. Advocacy indicates that a patient answered “Yes” when asked whether they would recommend the hospital to a close family member or friend. Because both Topbox Overall score and Advocacy are binary variables, we used a logistic model for each response. The dataset contains 434 observations and 21 potential predictors. Most predictors are on an ordinal scale and contain many missing values. Ordinal predictors were converted to a Likert scale and treated as numeric, which reduced the number of parameters required to fit the logistic models. Missing values were examined to determine the cause of missingness, and most were found to be missing because they were not applicable. These missing values were changed to zero on the Likert scale, which allowed the affected observations to remain in the analysis. In total, 16 observations were removed from the analysis due to missing values, leaving 418 observations for the model-building process. We used several variable selection techniques to identify a parsimonious model for each of the two response variables. Forward selection and backward elimination, two common techniques for variable selection, were used with a penalized AIC. Variable selection was also performed using backward elimination via the p-value approach, with the p-value computed using the chi-squared distribution. Multiple techniques were used to determine whether the results could be replicated, and all three techniques identified the same models.
After the reduced models were identified, two procedures were used for model checking: Cook’s distances and the Hosmer-Lemeshow test. The Cook’s distances identified no influential points or outliers, and the Hosmer-Lemeshow test indicated that the logistic models were appropriate for both response variables. The variable selection process resulted in three predictors for Topbox Overall score and two predictors for Advocacy. Using these predictors, a full interaction model was generated for each response. None of the interactions were significant, so the additive models were accepted as the final models. For Topbox Overall score, the three predictors identified were clear communication by nurses, receiving care within 30 minutes of arriving in the emergency room, and doctors spending enough time with patients. For Advocacy, the two predictors identified were doctors listening carefully and nurses spending enough time with patients. Both models included predictors involving doctors and nurses, but the specific variables differed. Variables related to communication and time spent with the patient were important themes for both models. Timeliness of care had a greater impact on Topbox Overall score than on Advocacy. en_US
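The preprocessing described in the abstract (ordinal answers converted to a numeric Likert scale, not-applicable answers recoded to zero, and observations with remaining missing values removed) can be illustrated with a minimal sketch. The response labels, the `"N/A"` marker, and the function names below are assumptions made for illustration; the record does not give the survey's actual wording or the software used.

```python
# Hypothetical response labels; the record does not list the survey's wording.
LIKERT = {"Never": 1, "Sometimes": 2, "Usually": 3, "Always": 4}
NOT_APPLICABLE = "N/A"  # assumed marker for "question did not apply"


def recode(response):
    """Map one ordinal response to a numeric score, or None if truly missing."""
    if response == NOT_APPLICABLE:
        return 0  # not-applicable answers become zero, so the row can stay
    return LIKERT.get(response)  # None for blanks or unrecognized text


def prepare(rows):
    """Recode every row and drop rows that still contain a missing value."""
    kept = []
    for row in rows:
        scores = [recode(r) for r in row]
        if None not in scores:
            kept.append(scores)
    return kept


rows = [
    ["Always", "Usually", "N/A"],    # kept: N/A becomes 0
    ["Sometimes", None, "Always"],   # dropped: one answer is truly missing
    ["Never", "Always", "Usually"],  # kept unchanged apart from recoding
]
cleaned = prepare(rows)
print(cleaned)  # [[4, 3, 0], [1, 4, 3]]
```

Treating each predictor as a single numeric score means one coefficient per predictor, rather than one per response level, which is the parameter reduction the abstract refers to.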
dc.language.iso en_US en_US
dc.subject Statistics en_US
dc.subject Logistic model en_US
dc.subject Variable selection en_US
dc.subject Patient en_US
dc.subject Survey en_US
dc.subject Data en_US
dc.title Analysis of patient satisfaction survey data en_US
dc.type Report en_US Master of Science en_US
dc.description.level Masters en_US
dc.description.department Department of Statistics en_US
dc.description.advisor Karen Keating en_US 2019 en_US August en_US
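The Hosmer-Lemeshow check mentioned in the abstract compares observed and expected event counts within groups of observations ordered by fitted probability. The sketch below is one common way to compute it, not necessarily the procedure used in the report: the choice of 10 groups, the function name, and the toy data are all assumptions. To stay within the standard library, the p-value uses the closed-form chi-squared tail that holds for even degrees of freedom.

```python
import math


def hosmer_lemeshow(y, p, groups=10):
    """Return (statistic, p_value) for a Hosmer-Lemeshow goodness-of-fit test.

    y: observed 0/1 outcomes; p: fitted probabilities from a logistic model.
    """
    if groups < 4 or groups % 2:
        raise ValueError("this sketch needs an even number of groups >= 4")
    # Order observations by fitted probability (stable sort keeps tie order).
    pairs = sorted(zip(p, y), key=lambda t: t[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        observed = sum(yi for _, yi in chunk)
        expected = sum(pi for pi, _ in chunk)
        # Contribution of this group: (O - E)^2 / (n_g * pbar * (1 - pbar)).
        stat += (observed - expected) ** 2 / (expected * (1 - expected / size))
    # Chi-squared upper tail with df = groups - 2, exact when df is even.
    half, k = stat / 2, (groups - 2) // 2
    tail = sum(half ** i / math.factorial(i) for i in range(k))
    return stat, math.exp(-half) * tail


# Toy data (made up for illustration, not from the hospital dataset).
fitted = [(i + 0.5) / 40 for i in range(40)]
outcomes = [1 if (i * 7) % 40 < i else 0 for i in range(40)]
statistic, p_value = hosmer_lemeshow(outcomes, fitted)
print(f"HL statistic = {statistic:.3f}, p-value = {p_value:.3f}")
```

A large p-value gives no evidence of lack of fit, which is the sense in which the abstract reports that the logistic models were appropriate.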
