Machine learning and data science for a household-specific poverty level prediction task

Venkatramolla, Sudesh Kumar

Files

SudeshkumarVenkatramolla2019.pdf (486.93 KB)

Date

2019-05-01

Authors

Venkatramolla, Sudesh Kumar

Publisher

Kansas State University

Abstract

This project addresses a prediction task from the Kaggle data science challenge site: predicting the poverty level of individual households using supervised classification learning. In Latin America, the Proxy Means Test (PMT) is the most popular method for verifying income qualification. The PMT considers observable properties of a household, such as the walls, ceilings, and electric devices in the family home. These and other general assets are used to classify the poverty level, assigning one of four labels: (1) extreme poverty, (2) moderate poverty, (3) vulnerable households, and (4) non-vulnerable households. The accuracy of learned classification models submitted as solutions to this data challenge has tended to decrease as a function of dataset size. Therefore, in this project, I focus on methods for improving accuracy in detecting poverty level using committee machines (bagging, boosting, etc.) for supervised inductive learning. Because the task is classification learning, my first approach is to apply random forests (a decision tree ensemble method); depending on the accuracy achieved, I will proceed to more advanced methods, such as light gradient-boosting machines (GBMs) and neural networks, which are frequently used on large, complex multivariate classification tasks. The inference task is to predict the poverty level of a new household using attributes of the family home and other attributes found to be relevant by the learning algorithm. This enables use cases of artificial intelligence for social good, such as helping governments and relief and economic development agencies identify communities in need.
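The committee-machine idea named in the abstract can be illustrated with a minimal sketch of bagging: several small decision trees are each trained on a bootstrap resample, and the ensemble predicts by majority vote over the four poverty labels. This is not the report's implementation. The feature names (`wall`, `devices`), the synthetic grid of households, and every function name below are assumptions made for illustration only. A random forest would additionally sample a subset of features at each split, which this sketch omits.

```python
import random
from collections import Counter

# Class labels as listed in the abstract.
LABELS = {1: "extreme poverty", 2: "moderate poverty",
          3: "vulnerable", 4: "non-vulnerable"}

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    """Most frequent label (the first-seen label wins ties)."""
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, labels, depth=0, max_depth=4):
    """Grow a small classification tree by greedy Gini splitting."""
    if depth >= max_depth or len(set(labels)) == 1:
        return majority(labels)                       # leaf: a class label
    best = None                                       # (impurity, feature, threshold)
    for f in range(len(rows[0])):
        # Skip the largest value so the right-hand side is never empty.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:                                  # no feature varies: cannot split
        return majority(labels)
    _, f, t = best
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict_tree(node, row):
    """Follow splits until a leaf (a bare label) is reached."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_bagging(rows, labels, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(rows)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]    # sample with replacement
        trees.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx]))
    return trees

def predict(trees, row):
    """Majority vote of the committee."""
    return majority([predict_tree(tree, row) for tree in trees])

# Synthetic toy data (an assumption, not the Kaggle dataset): the label
# depends only on a hypothetical wall-quality score; devices is uninformative.
rows = [(wall, devices) for wall in range(4) for devices in range(10)]
labels = [wall + 1 for wall, _ in rows]

ensemble = fit_bagging(rows, labels)
accuracy = sum(predict(ensemble, r) == y for r, y in zip(rows, labels)) / len(rows)
print(f"training accuracy: {accuracy:.2f}")
print(LABELS[predict(ensemble, (0, 5))])
```

On real survey data the split search would run over many more attributes, and accuracy would be measured on held-out households instead of the training set.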

Keywords

Machine Learning, Data Science, Prediction, Classification

Graduation Month

May

Degree

Master of Science

Department

Department of Computer Science

Major Professor

William H. Hsu

Type

Report

URI

http://hdl.handle.net/2097/39520

Collections

K-State Electronic Theses, Dissertations, and Reports: 2004 -
