Automatic detection of significant features and event timeline construction from temporally tagged data
Date
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
The goal of my project is to summarize large volumes of data and help users to visualize how events have unfolded over time. I address the problem of extracting overview terms from a time-tagged corpus of data and discuss some previous work conducted in this area. I use a statistical approach to automatically extract key terms, form groupings of related terms, and display the resultant groups on a timeline. I use a static corpus composed of news stories, as opposed to an on-line setting where continual additions to the corpus are being made. Terms are extracted using a Named Entity Recognizer, and importance of a term is determined using the [superscript]X[superscript]2 measure. My approach does not address the problem of associating time and date stamps with data, and is restricted to corpora that been explicitly tagged. The quality of results obtained is gauged subjectively and objectively by measuring the degree to which events known to exist in the corpus were identified by the system.