Ontology engineering and feature construction for predicting friendship links and users interests in the Live Journal social network

Date

2008-10-15T14:48:29Z

Publisher

Kansas State University

Abstract

An ontology can be seen as an explicit description of the concepts and relationships that exist in a domain. In this thesis, we address the problem of building an interest ontology and using it to construct features for predicting both potential friendship relations between users of the Live Journal social network and those users' interests. Previous work has shown that the accuracy of predicting friendship links in this network is very low when only the interests common to two users are used as features and no network graph features are considered. Our goal is therefore to organize users' interests into an ontology (specifically, a concept hierarchy) and to use the semantics captured by this ontology to improve the performance of learning algorithms at the task of predicting whether two users can be friends. To achieve this goal, we have designed and implemented a hybrid clustering algorithm that combines the hierarchical agglomerative and divisive clustering paradigms and automatically builds the interest ontology. We have explored the use of this ontology to construct interest-based features and shown that the resulting features improve the performance of various classifiers for predicting friendships in the Live Journal social network. We have also shown that the interest ontology makes it possible to address the problem of predicting the interests of Live Journal users, a task that is not feasible without the ontology because of the overwhelming number of distinct interests.
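
As a rough illustration of how a concept hierarchy can yield interest-based features for a pair of users, the following minimal Python sketch may help. It is not the thesis's algorithm: the toy hierarchy `PARENT`, the function names, and the two features are hypothetical, and the hierarchy is written by hand rather than built by the hybrid clustering algorithm described above.

```python
# Hypothetical, hand-written concept hierarchy: each interest or concept
# maps to its parent concept. In the thesis the hierarchy is built
# automatically by clustering; this table is only an assumption.
PARENT = {
    "jazz": "music",
    "blues": "music",
    "music": "arts",
    "painting": "arts",
    "hiking": "outdoors",
}


def ancestors(interest):
    """Return the interest followed by all of its ancestors in PARENT."""
    chain = [interest]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def generalize(interests):
    """Expand a set of raw interests to include every ancestor concept."""
    concepts = set()
    for interest in interests:
        concepts.update(ancestors(interest))
    return concepts


def pair_features(interests_a, interests_b):
    """Count raw and hierarchy-level overlaps for a candidate user pair."""
    raw_common = len(set(interests_a) & set(interests_b))
    concept_common = len(generalize(interests_a) & generalize(interests_b))
    return {"raw_common": raw_common, "concept_common": concept_common}


# Two users with no literal interest in common still share the
# higher-level concepts "music" and "arts".
features = pair_features({"jazz", "hiking"}, {"blues", "painting"})
```

In this toy case the two users share no raw interests, yet the hierarchy exposes two shared concepts, which is the kind of signal that features based only on literal common interests cannot capture.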

Keywords

Social network analysis, Interest ontology, Clustering, Machine learning, Friendship link prediction, Interest prediction

Graduation Month

December

Degree

Master of Science

Department

Department of Computing and Information Sciences

Major Professor

Doina Caragea; William H. Hsu

Date

2008

Type

Thesis
