Entity extraction, animal disease-related event recognition and classification from web

K-REx Repository

dc.contributor.author Volkova, Svitlana
dc.date.accessioned 2010-08-10T13:18:27Z
dc.date.available 2010-08-10T13:18:27Z
dc.date.issued 2010-08-10T13:18:27Z
dc.identifier.uri http://hdl.handle.net/2097/4593
dc.description.abstract Global epidemic surveillance is an essential task for national biosecurity management and bioterrorism prevention. The main goal is to protect the public from major health threats. Performing this task effectively requires reliable, timely and accurate medical information from a wide range of sources. Towards this goal, we present a framework for epidemiological analytics that can be used to automatically extract and visualize infectious disease outbreaks from a variety of unstructured web sources. More precisely, in this thesis we consider several research tasks in the veterinary epidemiology domain: document relevance classification, entity extraction and animal disease-related event recognition. First, we crawl web sources and classify the collected documents by topical relevance using supervised learning algorithms. Next, we propose a novel approach for automated ontology construction in the veterinary medicine domain, based on semantic relationship discovery using syntactic patterns. We then apply our automatically constructed ontology to the domain-specific entity extraction task. Moreover, we compare our ontology-based entity extraction results with an alternative sequence labeling approach: a method for entity tagging that relies on syntactic feature extraction using a sliding window. Finally, we present our novel sentence-based event recognition approach, which includes three main steps: extraction of entities (animal diseases, species, locations, dates and confirmation status n-grams); classification of event-related sentences into two categories (suspected or confirmed); and automated event tuple generation and aggregation. We show that our document relevance classification, entity extraction and disease-related event recognition results are significantly better than those reported by other animal disease surveillance systems. en_US
dc.description.sponsorship National Agriculture Biosecurity Center en_US
dc.language.iso en_US en_US
dc.publisher Kansas State University en
dc.subject entity extraction en_US
dc.subject event recognition and classification en_US
dc.subject web mining en_US
dc.subject document classification en_US
dc.subject named entity recognition en_US
dc.title Entity extraction, animal disease-related event recognition and classification from web en_US
dc.type Thesis en_US
dc.description.degree Master of Science en_US
dc.description.level Masters en_US
dc.description.department Department of Computing and Information Sciences en_US
dc.description.advisor William H. Hsu en_US
dc.subject.umi Computer Science (0984) en_US
dc.date.published 2010 en_US
dc.date.graduationmonth August en_US
