Email and phone number entity search and ranking

dc.contributor.author: Hao, Shuang
dc.date.accessioned: 2008-12-01T20:41:17Z
dc.date.available: 2008-12-01T20:41:17Z
dc.date.graduationmonth: December
dc.date.issued: 2008-12-01T20:41:17Z
dc.date.published: 2008
dc.description.abstract: Entity search has been proposed as a search method for domain-specific Internet applications. It differs from the classical approach used by search engines, which give a "page-view result": a list of the URLs of web pages containing the desired keywords. Entity search returns more structured results that list the specific information a user seeks, such as an email address or a phone number. It provides not only the URLs of the targets but also attributes of the target entities (e.g., email address, phone number). Compared to classical search methods, entity search is a more direct and user-friendly way to search a large volume of web documents. After the user submits a query, the extracted entities are ordered by their relevance to the query. While previous work has proposed various complex formulas for entity ranking, it has not been shown whether such complexity is needed. In this research I explore the question of whether a simpler method can achieve reasonable results. I have designed an entity-search and ranking algorithm using a formula that simply combines a page’s PageRank and an entity's distance to the query keywords to produce a metric for ranking discovered entities. My research goal is to determine whether effective entity ranking can be performed by an algorithm that computes matching scores specific to the entity search domain, and what improvements are necessary to refine the result. My approach takes into account the entity's proximity to the keywords in the query as well as the quality of the page that contains it. I implemented a system based on the algorithm and performed experiments showing that in most cases the result is consistent with the user's desired outcome.
dc.description.advisor: William H. Hsu
dc.description.degree: Master of Science
dc.description.department: Department of Computing and Information Sciences
dc.description.level: Masters
dc.identifier.uri: http://hdl.handle.net/2097/1019
dc.language.iso: en_US
dc.publisher: Kansas State University
dc.rights: © the author. This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Entity Search
dc.subject: Entity Ranking
dc.subject.umi: Computer Science (0984)
dc.title: Email and phone number entity search and ranking
dc.type: Thesis
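
The abstract describes ranking extracted entities by combining a page's PageRank with the entity's distance to the query keywords, but this record does not give the formula itself. The following is only a minimal illustrative sketch of that general idea, not the thesis's method: the function name rank_entities, the weight alpha, the 1 / (1 + distance) proximity term, the token-based distance, and the simplified email pattern are all assumptions introduced here, and PageRank values are assumed to be precomputed and scaled to [0, 1].

```python
import re

# Simplified email pattern, for illustration only (an assumption, not from the thesis).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def rank_entities(pages, query, alpha=0.5):
    """Rank email entities found in pages against a keyword query.

    pages: iterable of (url, text, pagerank), with pagerank assumed in [0, 1].
    alpha: hypothetical weight balancing page quality against proximity.
    Returns (score, email, url) tuples sorted from highest to lowest score.
    """
    keywords = {word.lower() for word in query.split()}
    results = []
    for url, text, pagerank in pages:
        tokens = text.split()
        # Token positions at which any query keyword occurs on this page.
        keyword_positions = [
            i for i, tok in enumerate(tokens)
            if tok.lower().strip(".,;:()") in keywords
        ]
        if not keyword_positions:
            continue  # no keyword on the page, so no distance can be computed
        for i, tok in enumerate(tokens):
            match = EMAIL_RE.search(tok)
            if match is None:
                continue
            # Distance in tokens from the entity to the nearest keyword.
            distance = min(abs(i - k) for k in keyword_positions)
            proximity = 1.0 / (1.0 + distance)
            # Assumed linear combination of page quality and proximity.
            score = alpha * pagerank + (1.0 - alpha) * proximity
            results.append((score, match.group(0), url))
    results.sort(key=lambda item: item[0], reverse=True)
    return results
```

With equal PageRank on two pages, an address that appears a few tokens from the query keywords outranks one that appears far from them; raising alpha shifts the ordering toward page quality instead.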

Files

Original bundle

Name: ShuangHao2008.pdf
Size: 1.11 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.68 KB
Format: Item-specific license agreed upon to submission