LDA based approach for predicting friendship links in live journal social network

dc.contributor.author: Parimi, Rohit
dc.date.accessioned: 2010-08-12T16:30:42Z
dc.date.available: 2010-08-12T16:30:42Z
dc.date.graduationmonth: August
dc.date.issued: 2010-08-12T16:30:42Z
dc.date.published: 2010
dc.description.abstract: The idea of socializing with people of different backgrounds and cultures excites web surfers. Today, there are hundreds of social networking sites on the web, with millions of users connected by relationships such as "friend", "follow", and "fan", forming a huge graph structure. The amount of data associated with the users of these sites has created opportunities for interesting data mining problems, including friendship link prediction, interest prediction, and tag recommendation, among others. In this work, we consider the friendship link prediction problem and study a topic modeling approach to it. Topic models are among the most effective approaches to latent topic analysis and mining of text data. In particular, probabilistic topic models are based on the idea that documents can be seen as mixtures of topics and topics can be seen as mixtures of words. Latent Dirichlet Allocation (LDA) is one such probabilistic model; it is generative in nature and is used for collections of discrete data such as text corpora. For our link prediction problem, users in the dataset are treated as "documents" and their interests as the document contents. The topic probabilities obtained by modeling users and interests with LDA provide an explicit representation for each user. User pairs are treated as examples and are represented by a feature vector constructed from the topic probabilities obtained with LDA. This vector captures only the information contained in the interests expressed by the users. Another important source of information relevant to the link prediction task is the graph structure of the social network. Our assumption is that a user "A" might be a friend of user "B" if: (a) users "A" and "B" have common or similar interests; (b) users "A" and "B" have some common friends.
While similarity between interests is captured by the topic modeling technique, we use the graph structure to find common friends. In the past, the graph structure underlying the network has proven to be a trustworthy source of information for predicting friendship links. We compare predictions from feature sets constructed using topic probabilities and the link graph separately against predictions from a feature set constructed using both topic probabilities and the link graph.
dc.description.advisor: Doina Caragea
dc.description.degree: Master of Science
dc.description.department: Department of Computing and Information Sciences
dc.description.level: Masters
dc.description.sponsorship: National Science Foundation
dc.identifier.uri: http://hdl.handle.net/2097/4624
dc.language.iso: en_US
dc.publisher: Kansas State University
dc.rights: © the author. This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Social Network Analysis
dc.subject: Topic Modeling
dc.subject: Friendship Link Prediction
dc.subject.umi: Computer Science (0984)
dc.title: LDA based approach for predicting friendship links in live journal social network
dc.type: Thesis
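
The pipeline outlined in the abstract (users as "documents", interests as their contents, LDA topic probabilities as user representations, and pair features that combine topics with common friends) might be sketched as follows. This is a minimal illustration and not the thesis implementation. The toy users, interests, and friend lists are invented. The collapsed Gibbs sampler, the hyperparameters `alpha` and `beta`, and the choice to concatenate the two topic vectors with a common-friend count are all assumptions, since the abstract does not specify the inference method or the exact feature layout.

```python
import random


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling (an assumed inference method).

    docs is a list of token lists; returns one topic distribution per document.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]               # tokens of doc d in topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # word w assigned to topic k
    topic_total = [0] * num_topics                              # all tokens in topic k
    assign = []                                                 # topic of every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, wid = assign[d][i], word_id[w]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    # Smoothed per-document topic probabilities.
    return [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]


def pair_features(a, b, theta, friends):
    """Concatenate both users' topic vectors and append the common-friend count."""
    common = len(friends[a] & friends[b])
    return theta[a] + theta[b] + [float(common)]


# Invented toy data: each user is a "document" whose "words" are interests.
interests = {
    "ann": ["hiking", "camping", "photography"],
    "bob": ["hiking", "camping", "fishing"],
    "cat": ["anime", "manga", "photography"],
    "dan": ["anime", "manga", "cosplay"],
}
friends = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"cat"},
}

users = list(interests)
theta = dict(zip(users, lda_gibbs([interests[u] for u in users], num_topics=2)))
features = pair_features("ann", "dan", theta, friends)
print(features)
```

A vector like `features` would then be passed, together with a friend/not-friend label for the pair, to whatever classifier is used for prediction. Dropping the final element gives an interests-only feature set, and keeping only the final element gives a graph-only one, which mirrors the comparison the abstract describes.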

Files

Original bundle

Name: RohitParimi2010.pdf
Size: 1.24 MB
Format: Adobe Portable Document Format
