Predictive analytics of institutional attrition

dc.contributor.author: Velumula, Sindhu
dc.date.accessioned: 2018-11-16T21:46:50Z
dc.date.available: 2018-11-16T21:46:50Z
dc.date.graduationmonth [en_US]: December
dc.date.issued: 2018-12-01
dc.date.published [en_US]: 2018
dc.description.abstract [en_US]: Institutional attrition refers to the phenomenon of members of an organization leaving it over time, a costly challenge faced by many institutions. This work treats attrition prediction as a supervised machine learning classification problem over summative historical variables. Raising the accuracy, precision, and recall of learned classifiers enables institutional administrators to take individualized preventive action based on the variables found to be relevant to the prediction that a particular member is at high risk of departure. This project applies multivariate logistic regression with wrapper-based feature selection to historical institutional data to determine which variables are relevant to a specified attrition classification task. In this work, I first describe a detailed approach to developing a machine learning pipeline for a range of predictive analytics tasks, such as anticipating employee or student attrition. The pipeline includes data preparation for supervised inductive learning tasks; training of various discriminative models; and evaluation of these models using performance metrics such as precision, accuracy, and specificity/sensitivity analysis. Next, I document a synthetic human resources dataset created by data scientists at IBM for simulating employee performance and attrition. I then apply supervised inductive learning algorithms such as logistic regression, support vector machines (SVM), random forests, and Naive Bayes to predict the attrition of individual employees based on a combination of personal and institution-wide factors. I compare the results of each algorithm to evaluate the predictive models for this classification task. Finally, I generate basic visualizations common to many analytics dashboards, including heat maps of the confusion matrix and the comparative accuracy, precision, recall, and F1 score for each algorithm.
From an applications perspective, once deployed, this model can be used by the human capital services units of an employer to find actionable ways (training, management, incentives, etc.) to reduce attrition and potentially boost longer-term retention.
dc.description.advisor [en_US]: William H. Hsu
dc.description.degree [en_US]: Master of Science
dc.description.department [en_US]: Department of Computer Science
dc.description.level [en_US]: Masters
dc.identifier.uri: http://hdl.handle.net/2097/39334
dc.language.iso [en_US]: en_US
dc.subject [en_US]: Classification
dc.subject [en_US]: Machine learning
dc.title [en_US]: Predictive analytics of institutional attrition
dc.type [en_US]: Report
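
The evaluation step named in the abstract (accuracy, precision, recall, F1, and specificity/sensitivity, all derived from a confusion matrix) can be illustrated with a minimal sketch. This is not the report's own code: the function names, the `ConfusionMatrix` type, and the ten example labels are hypothetical, and the positive class is assumed to mean "the member left".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion counts; positive is assumed to mean 'left'."""
    tp: int  # predicted left, actually left
    fp: int  # predicted left, actually stayed
    fn: int  # predicted stayed, actually left
    tn: int  # predicted stayed, actually stayed


def confusion_matrix(y_true, y_pred, positive=1):
    """Tally the four confusion-matrix cells from paired labels."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionMatrix(tp, fp, fn, tn)


def _ratio(numerator, denominator):
    """Return 0.0 instead of raising when a denominator is empty."""
    return numerator / denominator if denominator else 0.0


def summarize(cm):
    """Compute the metrics listed in the abstract from one matrix."""
    precision = _ratio(cm.tp, cm.tp + cm.fp)
    recall = _ratio(cm.tp, cm.tp + cm.fn)  # also called sensitivity
    return {
        "accuracy": _ratio(cm.tp + cm.tn, cm.tp + cm.fp + cm.fn + cm.tn),
        "precision": precision,
        "recall": recall,
        "specificity": _ratio(cm.tn, cm.tn + cm.fp),
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical labels: 1 = employee left, 0 = employee stayed.
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 0, 1]

cm = confusion_matrix(y_true, y_pred)
metrics = summarize(cm)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Running `summarize` once per classifier on the same held-out labels would give the per-algorithm comparison the abstract describes. On attrition data, where those who leave are usually the minority class, recall and specificity tend to be more informative than accuracy alone.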

Files

Original bundle
Name: SindhuVelumula2018.pdf
Size: 1.09 MB
Format: Adobe Portable Document Format
Description: Report
License bundle
Name: license.txt
Size: 1.62 KB
Format: Item-specific license agreed upon to submission