LDA based approach for predicting friendship links in live journal social network

dc.contributor.author: Parimi, Rohit
dc.date.accessioned: 2010-08-12T16:30:42Z
dc.date.available: 2010-08-12T16:30:42Z
dc.date.graduationmonth: August [en_US]
dc.date.issued: 2010-08-12T16:30:42Z
dc.date.published: 2010 [en_US]
dc.description.abstract: The idea of socializing with people of different backgrounds and cultures excites web users. Today, there are hundreds of social networking sites on the web, with millions of users connected by relationships such as "friend", "follow", and "fan", forming a huge graph structure. The amount of data associated with the users of these sites has created opportunities for interesting data mining problems, including friendship link prediction, interest prediction, and tag recommendation, among others. In this work, we consider the friendship link prediction problem and study a topic modeling approach to it. Topic models are among the most effective approaches to latent topic analysis and mining of text data. In particular, probabilistic topic models are based on the idea that documents can be seen as mixtures of topics and topics can be seen as mixtures of words. Latent Dirichlet Allocation (LDA) is one such probabilistic model; it is generative in nature and is used for collections of discrete data such as text corpora. For our link prediction problem, users in the dataset are treated as "documents" and their interests as the document contents. The topic probabilities obtained by modeling users and interests with LDA provide an explicit representation of each user. User pairs are treated as examples, each represented by a feature vector constructed from the topic probabilities obtained with LDA. This vector captures only the information contained in the interests expressed by the users. Another important source of information relevant to the link prediction task is the graph structure of the social network. Our assumption is that a user "A" might be a friend of user "B" if a) users "A" and "B" have common or similar interests, b) users "A" and "B" have some common friends.
While the topic modeling technique takes care of capturing similarity between interests, we use the graph structure to find common friends. In the past, the graph structure underlying the network has proven to be a trustworthy source of information for predicting friendship links. We present a comparison of predictions from feature sets constructed using topic probabilities and the link graph separately with predictions from a feature set constructed using both topic probabilities and the link graph. [en_US]
dc.description.advisor: Doina Caragea [en_US]
dc.description.degree: Master of Science [en_US]
dc.description.department: Department of Computing and Information Sciences [en_US]
dc.description.level: Masters [en_US]
dc.description.sponsorship: National Science Foundation [en_US]
dc.identifier.uri: http://hdl.handle.net/2097/4624
dc.language.iso: en_US [en_US]
dc.publisher: Kansas State University [en]
dc.subject: Social Network Analysis [en_US]
dc.subject: Topic Modeling [en_US]
dc.subject: Friendship Link Prediction [en_US]
dc.subject.umi: Computer Science (0984) [en_US]
dc.title: LDA based approach for predicting friendship links in live journal social network [en_US]
dc.type: Thesis [en_US]
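
The pipeline described in the abstract (users as "documents", interests as words, LDA topic probabilities as user representations, and common friends from the link graph) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the toy users, interests, and friend lists are hypothetical, the collapsed Gibbs sampler is just one common way to fit LDA (the abstract does not say which inference procedure was used), and building a pair's feature vector by concatenating the two users' topic probabilities and appending a common-friend count is only one plausible construction.

```python
import random


def lda_topic_probs(docs, num_topics, vocab_size, iters=100,
                    alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling (an assumed inference method)
    and return one topic-probability vector per document."""
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # counts per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # counts per (topic, word)
    topic_total = [0] * num_topics                              # words assigned to each topic
    assign = []                                                 # topic of every word occurrence

    # Random initial topic assignment for every word occurrence.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    # Resample each assignment conditioned on all the others.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic probabilities; each row sums to 1.
    return [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]


# Hypothetical toy data: each user is a "document" whose words are interests.
interests = {
    "ann": ["hiking", "camping", "kayaking"],
    "bob": ["camping", "hiking", "fishing"],
    "cat": ["anime", "manga", "cosplay"],
    "dan": ["manga", "anime", "hiking"],
}
# Hypothetical undirected friendship graph as adjacency sets.
friends = {
    "ann": {"cat", "dan"},
    "bob": {"cat", "dan"},
    "cat": {"ann", "bob"},
    "dan": {"ann", "bob"},
}

users = sorted(interests)
all_words = sorted({w for ws in interests.values() for w in ws})
vocab = {w: i for i, w in enumerate(all_words)}
docs = [[vocab[w] for w in interests[u]] for u in users]
theta = dict(zip(users, lda_topic_probs(docs, num_topics=2, vocab_size=len(vocab))))


def pair_features(a, b):
    """Assumed pair representation: both users' topic probabilities,
    followed by the number of friends they have in common."""
    common = len(friends[a] & friends[b])
    return theta[a] + theta[b] + [float(common)]


features = pair_features("ann", "bob")
```

With two topics, `features` holds five values: two topic probabilities for each user and one common-friend count. Dropping the last entry gives an interest-only feature set, and keeping only the last entry gives a graph-only feature set, which mirrors the three-way comparison the abstract describes. Any standard classifier could then be trained on such vectors, with known friendships as positive examples.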

Files

Original bundle
Name: RohitParimi2010.pdf
Size: 1.24 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.69 KB
Format: Item-specific license agreed upon to submission