Study on the performance of ontology based approaches to link prediction in social networks as the number of users increases

Date

2010-12-14

Publisher

Kansas State University

Abstract

Recent advances in social network applications have resulted in millions of users joining such networks in the last few years. User data collected from social networks can be used for various data mining problems, such as interest recommendation and friendship recommendation. A social network can, in general, be seen as a large directed graph whose nodes represent the users of the network (together with their information, e.g., user interests) and whose edges represent their interactions (also known as friendship links). Previous work [Hsu et al., 2007] on friendship link prediction has shown that graph features contain important predictive information. Furthermore, it has been shown that user interests can be used to improve link prediction if they are organized into an explicit or implicit ontology [Haridas, 2009; Parimi, 2010]. However, these previous studies were performed using a small set of users from the social network LiveJournal. The goal of this work is to study the performance of the ontology-based approach proposed in [Haridas, 2009] when the number of users in the dataset is increased. More precisely, we study the performance of the approach on data sets consisting of 1000, 2000, 3000 and 4000 users. Our results show that performance generally increases with the number of users. However, the problem quickly becomes intractable in terms of computation time. As part of our study, we also compare the results obtained using the ontology-based approach [Haridas, 2009] with the results obtained using the LDA-based approach in [Parimi, 2010], when such results are available.

Keywords

Ontology, Social Networks, Link Prediction, Data Mining Problems, Large dataset, Study Performance

Graduation Month

December

Degree

Master of Science

Department

Department of Computing and Information Sciences

Major Professor

Doina Caragea

Date

2010

Type

Thesis
