Cross-language tweet classification using Bing translator

Date

2018-05-01

Publisher

Kansas State University

Abstract

Social media affects our daily lives and is often one of the first sources of breaking news. Twitter in particular is among the most popular social media platforms, with around 330 million monthly users. From local events such as Fake Patty's Day to happenings across the world, Twitter gets there first. During a disaster, tweets can be used to post warnings, updates, and the status of available medical supplies, food supplies, and emergency personnel. Users continued to tweet about Hurricane Sandy despite the lack of network connectivity during the storm. Analysis of such tweets can help monitor a disaster, plan and manage the crisis, and aid research. In this work, we use publicly available tweets posted during several disasters and identify the tweets that are relevant to those disasters. Because the datasets are in different languages, the Bing translation API is used to detect the language of each tweet and translate it. The translations are then used as training data for supervised machine learning algorithms. Supervised learning is the process of learning a classifier from a labeled training dataset; the learned classifier can then be used to predict the output for any valid input. When trained on more observations, the algorithm generally improves its predictive performance.
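
The translate-then-classify idea described in the abstract can be illustrated with a minimal sketch. It is not the report's implementation. The function translate_tweet is a hypothetical stand-in for a call to a translation service and simply returns its input. The classifier is a small multinomial Naive Bayes chosen here for illustration, since the abstract does not name a specific algorithm. The example tweets and the labels "relevant" and "irrelevant" are made up.

```python
import math
from collections import Counter, defaultdict


def translate_tweet(text):
    """Hypothetical placeholder for a translation API call.

    A real pipeline would detect the language and translate the tweet here;
    this stub returns the text unchanged.
    """
    return text


def tokenize(text):
    # Lowercase and split on whitespace; real tweets would need more cleaning.
    return text.lower().split()


def train_naive_bayes(examples):
    """Learn word counts per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = tokenize(translate_tweet(text))
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"word_counts": word_counts, "label_counts": label_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-probability (add-one smoothing)."""
    tokens = tokenize(translate_tweet(text))
    total = sum(model["label_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, count in model["label_counts"].items():
        score = math.log(count / total)
        denom = sum(model["word_counts"][label].values()) + vocab_size
        for tok in tokens:
            score += math.log((model["word_counts"][label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up labeled examples, assumed to be already translated.
training = [
    ("flood water rising need rescue", "relevant"),
    ("shelter open food supply available", "relevant"),
    ("power outage storm damage reported", "relevant"),
    ("great coffee this morning", "irrelevant"),
    ("watching a movie tonight", "irrelevant"),
    ("new phone arrived today", "irrelevant"),
]
model = train_naive_bayes(training)
print(predict(model, "need rescue from flood"))
print(predict(model, "movie and coffee tonight"))
```

With a real translation call substituted for the stub, the same training and prediction steps would apply to tweets from datasets in different languages.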

Keywords

Disaster, Twitter, Text classification, Microsoft text translator API, Cross-validation

Graduation Month

May

Degree

Master of Science

Department

Department of Computing and Information Sciences

Major Professor

Doina Caragea

Date

2018

Type

Report
