Prediction of university student attrition rate using Ridge and Lasso Regression

K-REx Repository

Vallabhaneni, Teja Usha Sree 2019-04-15T21:56:04Z 2019-04-15T21:56:04Z 2019-05-01
dc.description.abstract One of the major challenges faced by many institutions is attrition. Institutional attrition is the phenomenon of individuals leaving an institution before completing term-limited programs; the term can apply to employees (e.g., postdoctoral fellows) or students (Bani, J., Haji, & M., 2017). In the context of this project, which focuses on student attrition, it includes students who drop out, are dismissed, or do not return to their studies before completing their degree. The student attrition rate at a university is often measured as the net change in enrollment per year due to students discontinuing their studies at that university. One consequence of attrition is that students are unable to graduate despite significant investments in the form of funding from scholarship-granting institutions or governments. This project studies the factors contributing to student attrition at a land-grant state university and predicts whether a student will drop out based on factors such as gender, race, and cumulative GPA. The study is timely and necessary because a predictive model may allow an institution to recognize factors that contribute to dropping out, which in turn can help it retain students and prevent dropout and "stopout" in certain cases. A decrease in preventable attrition may similarly enable more students to earn the degrees they were pursuing, at a point where they can realize more of the professional and economic benefits of that degree. The report comprises a brief review of the supporting literature on student attrition rate prediction and describes a machine learning and data science project centered on further explorations of a previously developed experimental test bed.
These involve extracting data from historical archives (raw data from the university registrar's office and other sources), cleaning the data, building the training and testing data sets for the supervised learning algorithms, training and evaluating the models, and reviewing the models to derive actionable insights. Logistic regression, a supervised inductive learning algorithm, is used to train a classification model, which in turn is used to predict student dropout on a case-by-case basis. Regression models that use L2 regularization (ridge regression) and L1 regularization (lasso regression) are also used to predict student dropout. These regularized models support feature selection and the creation of a flexible model when the data consist of a large set of features. Performance metrics such as precision, accuracy, recall, and F1 score are used to compare the models' performance. en_US
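The modeling approach summarized in the abstract can be illustrated with a minimal sketch. It is not the report's implementation: the data are synthetic, the two features (a standardized GPA and an irrelevant noise column) are hypothetical, and the function names, penalty strength `lam`, learning rate, and epoch count are assumptions chosen only for illustration. The sketch fits a logistic regression classifier with either an L2 (ridge-style) or L1 (lasso-style) penalty using the Python standard library, then computes accuracy, precision, recall, and F1 score.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, penalty="l2", lam=0.05, lr=0.1, epochs=300):
    """Batch gradient descent on mean log loss with an L1 or L2 weight penalty."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        for j in range(d):
            if penalty == "l2":
                # Ridge-style penalty lam * w^2 contributes 2 * lam * w to the gradient.
                w[j] -= lr * (grad_w[j] + 2 * lam * w[j])
            else:
                # Lasso-style penalty: gradient step, then soft-thresholding,
                # which can set a weight exactly to zero.
                wj = w[j] - lr * grad_w[j]
                w[j] = math.copysign(max(abs(wj) - lr * lam, 0.0), wj)
        b -= lr * grad_b  # the intercept is not penalized
    return w, b

def predict(X, w, b, threshold=0.5):
    # Label 1 means "predicted to drop out".
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= threshold else 0
            for xi in X]

def classification_metrics(y_true, y_pred):
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}

# Synthetic, hypothetical data: a lower standardized GPA raises the dropout probability;
# the second feature is pure noise.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    gpa_z, noise = rng.gauss(0, 1), rng.gauss(0, 1)
    X.append([gpa_z, noise])
    y.append(1 if rng.random() < sigmoid(-2.0 * gpa_z) else 0)

w_l2, b_l2 = train_logistic(X, y, penalty="l2")
w_l1, b_l1 = train_logistic(X, y, penalty="l1")
for name, (w, b) in {"L2": (w_l2, b_l2), "L1": (w_l1, b_l1)}.items():
    print(name, [round(v, 3) for v in w], classification_metrics(y, predict(X, w, b)))
```

For brevity, the sketch evaluates on the same data it was trained on; a held-out test set, as the abstract describes, would be needed for a meaningful comparison. The soft-thresholding step is what allows the L1 penalty to drive uninformative weights to exactly zero, which is the sense in which lasso-style regularization can act as feature selection.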
dc.language.iso en_US en_US
dc.subject Student Attrition Rate en_US
dc.subject University Attrition Rate en_US
dc.subject Machine learning en_US
dc.subject Classification en_US
dc.title Prediction of university student attrition rate using Ridge and Lasso Regression en_US
dc.type Report en_US Master of Science en_US
dc.description.level Masters en_US
dc.description.department Department of Computer Science en_US
dc.description.advisor William H. Hsu en_US 2019 en_US May en_US
