Twitter data analysis to enhance Android malware detection

Date

2020-12-01

Abstract

In recent years, mobile applications (apps) have proliferated, including both useful, benign apps and malicious apps (malware). Identifying malicious apps is a challenging but urgent problem, as they can cause significant damage and financial losses to their users. Most malware detection systems rely on features extracted from the code of the apps themselves using static or dynamic analysis. However, many zero-day malware apps still evade such systems and enter the market. To complement the information contained in the code and facilitate the detection of zero-day Android malware, we propose to use social media information, specifically Twitter, to identify tweets that mention Android malware, in particular those that may contribute to its spread. The assumption is that users who try to advertise or spread malware share the characteristics of spam users. We used the Twitter Developer APIs to crawl a large number of tweets that contain URLs corresponding to Android apps. The tweets, together with meta-information about their retweets and favorites and about their users, were stored in a MongoDB database. The URLs in the collected tweets were matched with Android apps using information crawled from the Google Play Store. The matched apps were then labeled as benign or malware using AndroZoo, a platform that relies on anti-virus scan results from services such as VirusTotal to identify malware. Finally, Twitter users who post tweets linking to malware are being studied to identify patterns characteristic of spam users, which could potentially be used to identify zero-day malware.
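
The URL-matching and labeling steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names `play_store_package` and `label_tweets`, the tweet dictionary keys `id` and `urls`, and the `malware_packages` set (a stand-in for labels derived from AndroZoo) are all assumptions made for illustration. The sketch also assumes that shortened links have already been expanded to full Google Play URLs.

```python
from urllib.parse import urlparse, parse_qs


def play_store_package(url):
    """Return the package name from a Google Play app URL, or None if absent."""
    parsed = urlparse(url)
    # Only accept links that point at a Google Play app details page.
    if parsed.netloc.lower() != "play.google.com":
        return None
    if not parsed.path.startswith("/store/apps/details"):
        return None
    # The package name is carried in the "id" query parameter.
    ids = parse_qs(parsed.query).get("id")
    return ids[0] if ids else None


def label_tweets(tweets, malware_packages):
    """Yield (tweet_id, package, label) for every Play Store URL in the tweets.

    `tweets` is a list of dicts with hypothetical keys "id" and "urls";
    `malware_packages` is a hypothetical set of package names flagged as malware.
    """
    for tweet in tweets:
        for url in tweet.get("urls", []):
            package = play_store_package(url)
            if package is None:
                continue  # not a Google Play app link
            label = "malware" if package in malware_packages else "benign"
            yield tweet["id"], package, label


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        {"id": 1, "urls": ["https://play.google.com/store/apps/details?id=com.example.bad"]},
        {"id": 2, "urls": ["https://play.google.com/store/apps/details?id=com.example.ok"]},
    ]
    for row in label_tweets(sample, {"com.example.bad"}):
        print(row)
```

In this sketch an app absent from the malware set is treated as benign, which is a simplification; a real pipeline would presumably also need to handle apps for which no label is available.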

Keywords

Twitter data, Android applications, Google Play Store

Graduation Month

December

Degree

Master of Science

Department

Department of Computer Science

Major Professor

Doina Caragea

Type

Thesis
