Automated malware analysis for Android applications through raw bytecode
dc.contributor.author | Hauser, Joy | |
dc.date.accessioned | 2021-05-12T13:25:35Z | |
dc.date.available | 2021-05-12T13:25:35Z | |
dc.date.graduationmonth | August | |
dc.date.issued | 2021 | |
dc.description.abstract | Securing mobile phone applications is a major area of research, given how widespread mobile phones are today. Android encourages developers to build Java applications that run on Android devices. While this gives developers a lot of freedom, it gives the same opportunity to malware authors. Defenses therefore need to be put in place to determine whether an application is malicious or benign. In addition, this determination needs to be automated, given the massive number of applications that incident responders would otherwise need to review. To address the question of how to determine whether an application is malicious, this thesis used an LSTM model. The goal was to test whether treating individual Java bytecode instructions as "words" in a sentence for an NLP task would provide decent performance compared to the expectations for this dataset. A logistic regression model provided a baseline measurement of the expected results. Six different configurations were tried for both models to determine which configuration performed best on applications pulled from the Androzoo repository. The LSTM model achieved very similar performance across all six experiments, with only the loss value changing. The best LSTM configuration achieved an accuracy of 0.9, a precision of 0.933, a recall of 0.83, an F1-score of 0.841, and a loss of 0.332. The equivalent logistic regression experiment resulted in a loss of 10.198, an accuracy of 0.86, a precision of 0.733, a recall of 0.75, and an F1-score of 0.731. The LSTM model performed better than the logistic regression model, but increasing the amount of input may provide better results. | |
dc.description.advisor | George Amariucai | |
dc.description.degree | Master of Science | |
dc.description.department | Department of Computer Science | |
dc.description.level | Masters | |
dc.identifier.uri | https://hdl.handle.net/2097/41536 | |
dc.language.iso | en_US | |
dc.publisher | Kansas State University | |
dc.rights | © the author. This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s). | |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | |
dc.subject | Android | |
dc.subject | Malware | |
dc.subject | LSTM | |
dc.subject | Java bytecode | |
dc.subject | Logistic regression | |
dc.subject | Malware analysis | |
dc.title | Automated malware analysis for Android applications through raw bytecode | |
dc.type | Thesis |