Machine learning and data science for a household-specific poverty level prediction task

dc.contributor.author: Venkatramolla, Sudesh Kumar
dc.date.accessioned: 2019-04-16T20:57:18Z
dc.date.available: 2019-04-16T20:57:18Z
dc.date.graduationmonth: May
dc.date.issued: 2019-05-01
dc.description.abstract: This project focuses on a prediction task from the Kaggle data science challenge site: predicting the poverty level of individual households using supervised classification learning. In Latin America, the Proxy Means Test (PMT) is the most popular method used to verify income qualification. The PMT works by considering the observable properties of a household, such as the walls, ceilings, and electric devices in a family home. These and other general assets are used to classify the poverty level, assigning one of four labels: (1) extreme poverty, (2) moderate poverty, (3) vulnerable households, and (4) non-vulnerable households. The accuracy of learned classification models submitted as solutions to this data challenge has tended to decrease as a function of dataset size. Therefore, in this project, I focus on methods for boosting accuracy in detecting poverty level using committee machines (bagging, boosting, etc.) for supervised inductive learning. Because the task is classification learning, my first approach is to apply random forests (a decision tree ensemble method); depending on the accuracy, I will proceed with more advanced methods, such as light gradient-boosting methods (GBMs) and neural networks, which are frequently used on large, complex multivariate classification tasks. The inference task is to predict the poverty level of a new household using attributes of the family home and other attributes found to be relevant by the learning algorithm. This enables use cases of artificial intelligence for social good, such as helping governments and relief and economic development agencies to identify communities in need.
dc.description.advisor: William H. Hsu
dc.description.degree: Master of Science
dc.description.department: Department of Computer Science
dc.description.level: Masters
dc.identifier.uri: http://hdl.handle.net/2097/39520
dc.language.iso: en_US
dc.publisher: Kansas State University
dc.rights: © the author. This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Machine Learning
dc.subject: Data Science
dc.subject: Prediction
dc.subject: Classification
dc.title: Machine learning and data science for a household-specific poverty level prediction task
dc.type: Report
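
The abstract describes committee machines (bagging, boosting) for assigning one of four poverty labels. As a rough illustration of the bagging idea only, the following sketch trains several base learners on bootstrap resamples and combines them by majority vote. Everything here is an assumption made for illustration: the data are synthetic (not the Kaggle dataset), the two features are hypothetical, and a simple nearest-centroid classifier stands in for the decision trees of a random forest. It is not the method used in the report.

```python
import random
from collections import Counter, defaultdict

# Labels follow the abstract: 1 = extreme poverty, 2 = moderate poverty,
# 3 = vulnerable, 4 = non-vulnerable.

def make_synthetic_households(n, rng):
    """Generate (features, label) pairs. Purely synthetic, hypothetical features."""
    data = []
    for _ in range(n):
        label = rng.randint(1, 4)
        # Feature means rise with the label; Gaussian noise blurs the classes.
        features = (label + rng.gauss(0, 0.6), 0.5 * label + rng.gauss(0, 0.6))
        data.append((features, label))
    return data

def fit_centroids(sample):
    """Base learner (stand-in for a decision tree): per-class feature means."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for (x0, x1), label in sample:
        acc = sums[label]
        acc[0] += x0
        acc[1] += x1
        acc[2] += 1
    return {label: (a[0] / a[2], a[1] / a[2]) for label, a in sums.items()}

def predict_centroid(centroids, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    return min(
        centroids,
        key=lambda c: (x[0] - centroids[c][0]) ** 2 + (x[1] - centroids[c][1]) ** 2,
    )

def fit_bagged(train, n_models, rng):
    """Bagging: fit each base learner on a bootstrap resample of the training set."""
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(train) for _ in train]  # sample with replacement
        models.append(fit_centroids(bootstrap))
    return models

def predict_bagged(models, x):
    """Committee decision: majority vote over the base learners."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
train = make_synthetic_households(400, rng)
test = make_synthetic_households(200, rng)
models = fit_bagged(train, n_models=25, rng=rng)
accuracy = sum(predict_bagged(models, x) == y for x, y in test) / len(test)
print(f"bagged accuracy on synthetic data: {accuracy:.2f}")
```

On real household data, a tree-based ensemble from an established library would be the more natural choice; the voting structure would stay the same.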

Files

Original bundle

Name: SudeshkumarVenkatramolla2019.pdf
Size: 486.93 KB
Format: Adobe Portable Document Format
Description: Masters Report

License bundle

Name: license.txt
Size: 1.62 KB
Format: Item-specific license agreed upon to submission