Predictive analytics for classification of immigration visa applications: a discriminative machine learning approach

K-REx Repository

Vegesana, Sharmila 2018-04-19T18:41:53Z 2018-04-19T18:41:53Z 2018-05-01 en_US
dc.description.abstract This work addresses the data science challenge problem of predicting the decisions on past immigration visa applications using supervised machine learning for classification. I describe an end-to-end approach that prepares historical data for supervised inductive learning, trains several discriminative models, and evaluates these models using simple statistical validation methods. The H-1B visa allows employers in the United States to temporarily employ foreign nationals in specialty occupations that require a bachelor’s degree or higher in the specific specialty, or its equivalent. These specialty occupations include, but are not limited to, medicine, health, journalism, and areas of science, technology, engineering, and mathematics (STEM). Every year, United States Citizenship and Immigration Services (USCIS) grants at most 85,000 visas under the current cap, even though the number of applicants far exceeds this limit; the selection process is said to be a lottery system. The dataset used for this experimental research project contains all petitions filed under this visa cap from 2011 to 2016. This project uses discriminative machine learning techniques to classify these petitions and predict the “case status” of each petition based on various factors. Exploratory data analysis is also performed to determine the top employers, the locations that most attract foreign nationals under this visa cap, and the job roles with the highest number of foreign workers. I apply supervised inductive learning algorithms such as Gaussian Naïve Bayes, Logistic Regression, and Random Forests to identify the most probable factors for H-1B visa certification and compare their results to determine the best predictive model for this testbed. en_US
dc.language.iso en_US en_US
dc.subject Classification en_US
dc.subject Machine learning en_US
dc.title Predictive analytics for classification of immigration visa applications: a discriminative machine learning approach en_US
dc.type Report en_US Master of Science en_US
dc.description.level Masters en_US
dc.description.department Department of Computer Science en_US
dc.description.advisor William Hsu en_US 2018 en_US May en_US
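
The abstract describes a pipeline of data preparation, model training, and simple statistical validation. The sketch below illustrates that pattern for only one of the named algorithms, Gaussian Naïve Bayes, written from scratch with a single hold-out split. It is not the author's code: the two features (a wage in thousands of dollars and a full-time indicator), the class labels CERTIFIED and DENIED, the function names, and the synthetic data generator are all hypothetical stand-ins, since the record does not list the actual dataset columns.

```python
import math
import random
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a class prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        n = len(members)
        means = [sum(col) / n for col in zip(*members)]
        # A small floor on the variance avoids division by zero.
        variances = [
            max(sum((x - m) ** 2 for x in col) / n, 1e-9)
            for col, m in zip(zip(*members), means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one row."""
    def log_posterior(params):
        prior, means, variances = params
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        return score
    return max(model, key=lambda label: log_posterior(model[label]))


def make_synthetic_petitions(n, seed=0):
    """Hypothetical stand-in data: [wage, full_time] -> case status."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        certified = rng.random() < 0.7
        wage = rng.gauss(90.0 if certified else 40.0, 10.0)
        full_time = rng.gauss(0.9 if certified else 0.4, 0.2)
        rows.append([wage, full_time])
        labels.append("CERTIFIED" if certified else "DENIED")
    return rows, labels


rows, labels = make_synthetic_petitions(1000)
split = int(0.8 * len(rows))  # simple hold-out validation: 80% train, 20% test
model = fit_gaussian_nb(rows[:split], labels[:split])
predictions = [predict(model, r) for r in rows[split:]]
accuracy = sum(p == y for p, y in zip(predictions, labels[split:])) / len(predictions)
print(f"hold-out accuracy: {accuracy:.3f}")
```

The synthetic classes are deliberately well separated, so the accuracy printed here says nothing about real petition data. Comparing this model against Logistic Regression and Random Forests, as the abstract describes, would mean training each on the same training split and scoring each on the same held-out rows.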
