Predictive analytics of institutional attrition

K-REx Repository

Velumula, Sindhu 2018-11-16T21:46:50Z 2018-11-16T21:46:50Z 2018-12-01
dc.description.abstract Institutional attrition refers to the phenomenon of members of an organization leaving it over time, a costly challenge faced by many institutions. This work focuses on the problem of predicting attrition as an application of supervised machine learning for classification using summative historical variables. Raising the accuracy, precision, and recall of learned classifiers enables institutional administrators to take individualized preventive action based on the variables that are found to be relevant to the prediction that a particular member is at high risk of departure. This project focuses on using multivariate logistic regression on historical institutional data with wrapper-based feature selection to determine variables that are relevant to a specified classification task for prediction of attrition. In this work, I first describe a detailed approach to the development of a machine learning pipeline for a range of predictive analytics tasks such as anticipating employee or student attrition. The pipeline stages include: data preparation for supervised inductive learning tasks; training various discriminative models; and evaluating these models using performance metrics such as precision, accuracy, and specificity/sensitivity analysis. Next, I document a synthetic human resource dataset created by data scientists at IBM for simulating employee performance and attrition. I then apply supervised inductive learning algorithms such as logistic regression, support vector machines (SVM), random forests, and Naive Bayes to predict the attrition of individual employees based on a combination of personal and institution-wide factors. I compare the results of each algorithm to evaluate the predictive models for this classification task. Finally, I generate basic visualizations common to many analytics dashboards, comprising results such as heat maps of the confusion matrix and the comparative accuracy, precision, recall, and F1 score for each algorithm.
From an applications perspective, once deployed, this model can be used by human capital services units of an employer to find actionable ways (training, management, incentives, etc.) to reduce attrition and potentially boost longer-term retention. en_US
dc.language.iso en_US en_US
dc.subject Classification en_US
dc.subject Machine learning en_US
dc.title Predictive analytics of institutional attrition en_US
dc.type Report en_US
Master of Science en_US
dc.description.level Masters en_US
dc.description.department Department of Computer Science en_US
dc.description.advisor William H. Hsu en_US
2018 en_US
December en_US
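
The pipeline outlined in the abstract (data preparation, wrapper-based feature selection around logistic regression, training several classifiers, and comparing accuracy, precision, recall, and F1 alongside a confusion matrix) can be sketched roughly as follows. This is a minimal illustration and not the report's actual code: it assumes scikit-learn, substitutes randomly generated data from make_classification for the IBM dataset, and the feature count, class balance, number of selected features, scoring metric, and hyperparameters are all placeholder choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): 1000 synthetic members, 12 numeric features,
# and an imbalanced positive ("left the institution") class.
X, y = make_classification(n_samples=1000, n_features=12, n_informative=5,
                           weights=[0.85, 0.15], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Data preparation: standardize features using training-set statistics only.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Wrapper-based feature selection: greedy forward selection in which each
# candidate subset is scored by cross-validating a logistic regression model.
selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000), n_features_to_select=5,
    direction="forward", scoring="roc_auc", cv=5)
selector.fit(X_train_s, y_train)
selected = selector.get_support(indices=True)  # indices of retained columns
X_train_sel = selector.transform(X_train_s)
X_test_sel = selector.transform(X_test_s)

# Candidate classifiers named in the abstract (default-ish hyperparameters).
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm": SVC(kernel="rbf"),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "naive_bayes": GaussianNB(),
}

# Train each model on the selected features and collect test-set metrics.
results = {}
for name, model in models.items():
    y_pred = model.fit(X_train_sel, y_train).predict(X_test_sel)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
        # Rows are true classes, columns are predicted classes.
        "confusion": confusion_matrix(y_test, y_pred, labels=[0, 1]),
    }

print("selected feature indices:", selected)
for name, m in results.items():
    print(f"{name}: acc={m['accuracy']:.3f} prec={m['precision']:.3f} "
          f"rec={m['recall']:.3f} f1={m['f1']:.3f}")
```

Each 2x2 array stored under "confusion" is the kind of matrix a dashboard would render as a heat map. With an imbalanced target, accuracy alone can look good for a model that rarely predicts departure, which is one reason to report precision, recall, and F1 side by side.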
