Topic modeling using latent dirichlet allocation on disaster tweets
dc.contributor.author | Patel, Virashree Hrushikesh | |
dc.date.accessioned | 2018-11-19T16:28:46Z | |
dc.date.available | 2018-11-19T16:28:46Z | |
dc.date.graduationmonth | December | en_US |
dc.date.issued | 2018-12-01 | |
dc.date.published | 2018 | en_US |
dc.description.abstract | Social media has changed the way people communicate information. It has been noted that social media platforms like Twitter are increasingly being used by people and authorities in the wake of natural disasters. The year 2017 was a historic year for the USA in terms of natural calamities and associated costs. According to NOAA (National Oceanic and Atmospheric Administration), during 2017 the USA experienced 16 separate billion-dollar disaster events, including three tropical cyclones, eight severe storms, two inland floods, a crop freeze, a drought, and a wildfire. During natural disasters, due to the collapse of infrastructure and telecommunication, it is often hard to reach out to people in need or to determine what areas are affected. In such situations, Twitter can be a lifesaving tool for local government and search and rescue agencies. Using the Twitter streaming API service, disaster-related tweets can be collected and analyzed in real time. Although tweets received from Twitter can be sparse, noisy and ambiguous, some may contain useful information with respect to situational awareness. For example, some tweets express emotions, such as grief or anguish, or call for help; other tweets provide information specific to a region, place or person; while others simply help spread information from news or environmental agencies. To extract information useful for disaster response teams, disaster tweets need to be cleaned and classified into various categories. Topic modeling can help identify topics in a collection of such disaster tweets. Subsequently, a topic (or a set of topics) can be associated with each tweet. Thus, in this report, we use Latent Dirichlet Allocation (LDA) to perform topic modeling on a dataset of disaster tweets. | en_US |
dc.description.advisor | Cornelia Caragea | en_US |
dc.description.advisor | Doina Caragea | en_US |
dc.description.degree | Master of Science | en_US |
dc.description.department | Department of Computer Science | en_US |
dc.description.level | Masters | en_US |
dc.identifier.uri | http://hdl.handle.net/2097/39337 | |
dc.language.iso | en_US | en_US |
dc.subject | topic modeling | en_US |
dc.subject | en_US | |
dc.subject | latent dirichlet allocation | |
dc.subject | LDA | |
dc.title | Topic modeling using latent dirichlet allocation on disaster tweets | en_US |
dc.type | Report | en_US |
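
The pipeline outlined in the abstract (clean the tweets, fit LDA, then associate each tweet with a topic) might be sketched roughly as follows. This is a minimal illustration and not the report's implementation: the sample tweets, the stopword list, the hyperparameters, and the helper names `clean` and `lda_gibbs` are all hypothetical, and collapsed Gibbs sampling is only one common way to fit LDA, chosen here because it needs nothing beyond the Python standard library.

```python
import random
import re
from collections import Counter

# Hypothetical toy tweets; NOT taken from the report's dataset.
TWEETS = [
    "Flood water rising on Main St, need rescue boat! http://t.co/x",
    "RT @news: flood warning, water levels rising near the river",
    "Praying for everyone who lost a home in the wildfire",
    "Wildfire smoke everywhere, so sad for families who lost a home",
]
# Assumed, deliberately tiny stopword list for illustration only.
STOPWORDS = {"the", "a", "in", "on", "for", "so", "who", "rt", "need", "near"}


def clean(tweet):
    """Lowercase, drop URLs and @mentions, keep alphabetic non-stopword tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return [w for w in re.findall(r"[a-z]+", text) if w not in STOPWORDS]


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    Returns (doc_topic, topic_word): per-document topic counts and
    per-topic word counts after the final sampling sweep.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


docs = [clean(t) for t in TWEETS]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2)

# Associate each tweet with its most frequent topic.
for d, counts in enumerate(doc_topic):
    print("tweet", d, "-> topic", counts.index(max(counts)))

# Show the most frequent words per topic.
for k, words in enumerate(topic_word):
    print("topic", k, [w for w, c in words.most_common(3) if c > 0])
```

On a corpus this small the learned topics are not meaningful; the sketch only shows the shape of the computation. A real study would more likely rely on an established library and a far larger tweet collection.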