Using statistical learning to predict survival of passengers on the RMS Titanic

dc.contributor.author: Whitley, Michael Aaron
dc.date.accessioned: 2015-11-19T19:22:08Z
dc.date.available: 2015-11-19T19:22:08Z
dc.date.graduationmonth: December (en_US)
dc.date.issued: 2015-12-01 (en_US)
dc.date.published: 2015 (en_US)
dc.description.abstract: Predictive analytics techniques have proven effective for exploring data. In this report, the efficiency of several predictive analytics methods is explored. At the time of this study, Kaggle.com, a data science competition website, was hosting the predictive modeling competition "Titanic: Machine Learning from Disaster." This competition posed a classification problem: build a model to predict the survival of passengers on the RMS Titanic. Our approach focused on applying a traditional classification and regression tree algorithm. The algorithm is greedy and can overfit the training data, which in turn can yield suboptimal prediction accuracy. To correct these issues, we implemented cost-complexity pruning and ensemble methods such as bagging and random forests. However, no improvement was observed, which may be an artifact of the Titanic data and may not be representative of those methods' performance. The decision trees and prediction accuracy of each method are presented and compared. Results indicate that sex/title, fare price, age, and passenger class are the most important predictors of passenger survival. (en_US)
dc.description.advisor: Christopher I. Vahl (en_US)
dc.description.degree: Master of Science (en_US)
dc.description.department: Statistics (en_US)
dc.description.level: Masters (en_US)
dc.identifier.uri: http://hdl.handle.net/2097/20541
dc.language.iso: en_US (en_US)
dc.publisher: Kansas State University (en)
dc.subject: Decision tree (en_US)
dc.subject: Ensemble (en_US)
dc.subject: Kaggle (en_US)
dc.subject: Titanic (en_US)
dc.subject.umi: Statistics (0463) (en_US)
dc.title: Using statistical learning to predict survival of passengers on the RMS Titanic (en_US)
dc.type: Report (en_US)

Files

Original bundle
Name: MichaelWhitley2015.pdf
Size: 694.03 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.62 KB
Format: Item-specific license agreed upon to submission