Exploring knowledge bases for engineering a user interests hierarchy for social network applications

dc.contributor.author: Haridas, Mandar
dc.date.accessioned: 2009-06-23T19:52:42Z
dc.date.available: 2009-06-23T19:52:42Z
dc.date.graduationmonth: August
dc.date.issued: 2009-06-23T19:52:42Z
dc.date.published: 2009
dc.description.abstract: In recent years, social networks have become an integral part of our lives. Their growth has created opportunities for interesting data mining problems, such as interest or friendship recommendation. A global ontology over the interests specified by the users of a social network is essential for accurate recommendations. The focus of this work is on engineering such an interest ontology. Because the resulting ontology is meant to be used in data mining applications for social network problems, we explore only hierarchical ontologies. We propose, evaluate, and compare three approaches to engineering an interest hierarchy. The proposed approaches make use of two popular knowledge bases, Wikipedia and Directory Mozilla, to extract interest definitions and/or relationships between interests. More precisely, the first approach uses Wikipedia to find interest definitions, latent semantic analysis to measure the similarity between interests based on their definitions, and an agglomerative clustering algorithm to group similar interests into higher-level concepts. The second approach uses the Wikipedia Category Graph to extract relationships between interests. Similarly, the third approach uses Directory Mozilla to extract relationships between interests. Our results indicate that the third approach, although the simplest, is the most effective for building an ontology over user interests. We use the ontology produced by the third approach to construct interest-based features, which are then used to learn classifiers for the friendship prediction task. The results show that the ontology is useful when compared with the results obtained in its absence.
dc.description.advisor: Doina Caragea
dc.description.advisor: Gurdip Singh
dc.description.degree: Master of Science
dc.description.department: Department of Computing and Information Sciences
dc.description.level: Masters
dc.identifier.uri: http://hdl.handle.net/2097/1528
dc.language.iso: en_US
dc.publisher: Kansas State University
dc.subject: Ontology
dc.subject: Social Networks
dc.subject: Wikipedia
dc.subject: Directory Mozilla
dc.subject: Latent Semantic Analysis
dc.subject.umi: Computer Science (0984)
dc.title: Exploring knowledge bases for engineering a user interests hierarchy for social network applications
dc.type: Thesis
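
The abstract describes a third approach that extracts relationships between interests from Directory Mozilla. As a rough illustration only, the sketch below assumes each interest has already been matched to a slash-separated directory category path and derives a hierarchy from the path prefixes. The mapping INTEREST_PATHS and the helpers build_hierarchy and common_ancestor are hypothetical names invented for this example; they are not taken from the thesis, and the thesis may construct its hierarchy differently.

```python
from collections import defaultdict

# Hypothetical toy mapping from interests to category paths, written in the
# slash-separated style of a web directory. These entries are illustrative
# assumptions, not data from the thesis.
INTEREST_PATHS = {
    "jazz": "Top/Arts/Music/Styles/Jazz",
    "rock": "Top/Arts/Music/Styles/Rock",
    "painting": "Top/Arts/Visual_Arts/Painting",
    "soccer": "Top/Sports/Soccer",
}


def build_hierarchy(interest_paths):
    """Return a child map: each category path maps to its set of child paths."""
    children = defaultdict(set)
    for path in interest_paths.values():
        parts = path.split("/")
        # Link every prefix of the path to the next, longer prefix.
        for depth in range(1, len(parts)):
            parent = "/".join(parts[:depth])
            child = "/".join(parts[:depth + 1])
            children[parent].add(child)
    return children


def common_ancestor(path_a, path_b):
    """Return the deepest category path shared by two category paths."""
    shared = []
    for a, b in zip(path_a.split("/"), path_b.split("/")):
        if a != b:
            break
        shared.append(a)
    return "/".join(shared)


hierarchy = build_hierarchy(INTEREST_PATHS)
print(sorted(hierarchy["Top"]))
print(common_ancestor(INTEREST_PATHS["jazz"], INTEREST_PATHS["rock"]))
```

Under these assumptions, two interests that share a deep common ancestor (here "jazz" and "rock" under Top/Arts/Music/Styles) are more closely related than two that meet only at the root. A measure of this kind is one possible basis for interest-based features, though the abstract does not specify which features the thesis uses.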

Files

Original bundle
Name: MandarHaridas2009.pdf
Size: 478.43 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.69 KB
Format: Item-specific license agreed upon to submission