Prediction of university student attrition rate using Ridge and Lasso Regression

dc.contributor.author: Vallabhaneni, Teja Usha Sree
dc.date.accessioned: 2019-04-15T21:56:04Z
dc.date.available: 2019-04-15T21:56:04Z
dc.date.graduationmonth: May (en_US)
dc.date.issued: 2019-05-01
dc.date.published: 2019 (en_US)
dc.description.abstract: One of the major challenges faced by many institutions is attrition. Institutional attrition is the phenomenon of individuals leaving an institution before completing a term-limited program; the term can apply to employees (e.g., postdoctoral fellows) or students (Bani, J., Haji, & M., https://pdfs.semanticscholar.org/94b1/, 2017). In the context of this project, which focuses on student attrition, it includes students who drop out, are dismissed, or do not return to their studies before completing their degree. The student attrition rate at a university is often measured in terms of the net change in enrollment per year due to students discontinuing their studies at that university. One consequence of attrition is that students are unable to graduate despite significant investments in the form of funding from scholarship-granting institutions or governments. This project studies the factors contributing to the student attrition rate at a land-grant state university and predicts whether a student will drop out based on factors such as gender, race, and cumulative GPA. The study is timely and necessary because a predictive model may allow an institution to recognize the factors that contribute to dropping out, helping it retain students and prevent dropout and “stopout” in certain cases. A decrease in preventable attrition may similarly enable more students to earn the degrees they were pursuing, at a point where they can realize more of the professional and economic benefits of those degrees. The report comprises a brief review of the supporting literature for the task of student attrition rate prediction and describes a machine learning and data science project centered on further exploration of a previously developed experimental test bed.
The project's steps are: extracting data from historical archives (raw data from the university registrar’s office and other sources), cleaning the data, building the training and testing data sets for the supervised learning algorithms, training and evaluating models, and reviewing the models to derive actionable insights. Logistic regression, a supervised inductive learning algorithm, is used to train a classification model, which in turn is used to predict student dropout on a case-by-case basis. Regression models that use L2 regularization (ridge regression) and L1 regularization (lasso regression) are also used to predict student dropout. These regularized algorithms are used for feature selection and to build a flexible model when the data consist of a large set of features. Performance metrics such as precision, accuracy, recall, and F1 score are used to compare the models. (en_US)
dc.description.advisor: William H. Hsu (en_US)
dc.description.degree: Master of Science (en_US)
dc.description.department: Department of Computer Science (en_US)
dc.description.level: Masters (en_US)
dc.identifier.uri: http://hdl.handle.net/2097/39511
dc.language.iso: en_US (en_US)
dc.subject: Student Attrition Rate (en_US)
dc.subject: University Attrition Rate (en_US)
dc.subject: Machine learning (en_US)
dc.subject: Classification (en_US)
dc.title: Prediction of university student attrition rate using Ridge and Lasso Regression (en_US)
dc.type: Report (en_US)
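
The abstract describes logistic regression with L2 (ridge) and L1 (lasso) regularization, compared by accuracy, precision, recall, and F1 score. The following is a minimal, dependency-free sketch of that idea, not the report's implementation. The function names, the hyperparameters (`lam`, `lr`, `epochs`), and the six-row toy data set are all assumptions made for illustration; the data are synthetic and do not come from the registrar's records. The sketch uses plain gradient descent for the L2 penalty and a proximal (soft-thresholding) step for the L1 penalty.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_logistic(X, y, penalty, lam=0.1, lr=0.1, epochs=1000):
    """Fit weights w and bias b by (proximal) gradient descent.

    penalty="l2": mean log-loss + (lam / 2) * sum(w_j ** 2)  (ridge-style)
    penalty="l1": mean log-loss + lam * sum(|w_j|)           (lasso-style)
    The bias term is not penalized.
    """
    if penalty not in ("l1", "l2"):
        raise ValueError("penalty must be 'l1' or 'l2'")
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - label
            for j in range(d):
                grad_w[j] += err * row[j] / n
            grad_b += err / n
        if penalty == "l2":
            w = [wj - lr * (gj + lam * wj) for wj, gj in zip(w, grad_w)]
        else:
            w = [soft_threshold(wj - lr * gj, lr * lam)
                 for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(X, w, b):
    """Label a row 1 (dropout) when the predicted probability is >= 0.5."""
    return [1 if sum(wj * xj for wj, xj in zip(w, row)) + b >= 0 else 0
            for row in X]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(y_true), "precision": precision,
            "recall": recall, "f1": f1}


# Synthetic toy data (hypothetical): column 0 is a standardized cumulative
# GPA, column 1 is a deliberately weak, low-magnitude feature.
# Label 1 means the student dropped out.
X = [[-1.5, 0.05], [-1.0, -0.05], [-0.5, 0.05],
     [0.5, -0.05], [1.0, 0.05], [1.5, 0.03]]
y = [1, 1, 1, 0, 0, 0]

w_l2, b_l2 = fit_logistic(X, y, penalty="l2")
w_l1, b_l1 = fit_logistic(X, y, penalty="l1")
metrics_l2 = classification_metrics(y, predict(X, w_l2, b_l2))
metrics_l1 = classification_metrics(y, predict(X, w_l1, b_l1))

print("ridge-style weights:", w_l2, metrics_l2)
print("lasso-style weights:", w_l1, metrics_l1)
```

On this toy data the L1 penalty leaves the weak feature's weight at exactly zero, which is the sense in which lasso performs feature selection, while the L2 penalty only shrinks it. A real study would evaluate on a held-out test set rather than the training rows used here.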

Files

Original bundle
Name: TejaushasreeVallabhaneni2019.pdf
Size: 845.04 KB
Format: Adobe Portable Document Format
Description: Report
License bundle
Name: license.txt
Size: 1.62 KB
Format: Item-specific license agreed upon to submission
Description: