Predictive analytics for classification of immigration visa applications: a discriminative machine learning approach

K-REx Repository

dc.contributor.author Vegesana, Sharmila
dc.date.accessioned 2018-04-19T18:41:53Z
dc.date.available 2018-04-19T18:41:53Z
dc.date.issued 2018-05-01 en_US
dc.identifier.uri http://hdl.handle.net/2097/38822
dc.description.abstract This work focuses on the data science challenge problem of predicting the decision for past immigration visa applications using supervised machine learning for classification. I describe an end-to-end approach that first prepares historical data for supervised inductive learning, then trains various discriminative models, and finally evaluates these models using simple statistical validation methods. The H-1B visa allows employers in the United States to temporarily employ foreign nationals in specialty occupations that require a bachelor’s degree or higher in the specific specialty, or its equivalent. These specialty occupations include, but are not limited to, medicine, health, journalism, and areas of science, technology, engineering and mathematics (STEM). Every year, United States Citizenship and Immigration Services (USCIS) grants at most 85,000 visas under the current cap, even though the number of applicants far exceeds this limit; the selection process is said to be a lottery system. The dataset used for this experimental research project contains all the petitions made under this visa cap from 2011 to 2016. This project aims to use discriminative machine learning techniques to classify these petitions and predict the “case status” of each petition based on various factors. Exploratory data analysis is also performed to determine the top employers, the locations with the most petitions for foreign nationals under this visa cap, and the job roles with the highest number of foreign workers. I apply supervised inductive learning algorithms such as Gaussian Naïve Bayes, Logistic Regression, and Random Forests to identify the factors most likely to lead to H-1B visa certification, and compare the results of each to determine the best predictive model for this testbed. en_US
dc.language.iso en_US en_US
dc.subject Classification en_US
dc.subject Machine learning en_US
dc.title Predictive analytics for classification of immigration visa applications: a discriminative machine learning approach en_US
dc.type Report en_US
dc.description.degree Master of Science en_US
dc.description.level Masters en_US
dc.description.department Department of Computer Science en_US
dc.description.advisor William Hsu en_US
dc.date.published 2018 en_US
dc.date.graduationmonth May en_US

