Link discovery in very large graphs by constructive induction using genetic programming

dc.contributor.authorWeninger, Timothy Edwards
dc.date.accessioned2008-12-18T22:34:12Z
dc.date.available2008-12-18T22:34:12Z
dc.date.graduationmonthDecember
dc.date.issued2008-12-18T22:34:12Z
dc.date.published2008
dc.description.abstractThis thesis discusses the background and methodologies necessary for constructing features in order to discover hidden links in relational data. Specifically, we consider the problems of predicting, classifying and annotating friends relations in friends networks, based upon features constructed from network structure and user profile data. I first document a data model for the blog service LiveJournal, and define a set of machine learning problems such as predicting existing links and estimating inter-pair distance. Next, I explain how the problem of classifying a user pair in a social networks, as directly connected or not, poses the problem of selecting and constructing relevant features. In order to construct these features, a genetic programming approach is used to construct multiple symbol trees with base features as their leaves; in this manner, the genetic program selects and constructs features that many not have been considered, but possess better predictive properties than the base features. In order to extract certain graph features from the relatively large social network, a new shortest path search algorithm is presented which computes and operates on a Euclidean embedding of the network. Finally, I present classification results and discuss the properties of the frequently constructed features in order to gain insight on hidden relations that exists in this domain.
dc.description.advisorWilliam H. Hsu
dc.description.degreeMaster of Science
dc.description.departmentDepartment of Computing and Information Sciences
dc.description.levelMasters
dc.description.sponsorshipDefense Intelligence Agency; Department of Homeland Security; National Agriculture Biosecurity Center
dc.identifier.urihttp://hdl.handle.net/2097/1087
dc.language.isoen_US
dc.publisherKansas State University
dc.rights© the author. This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
dc.rights.urihttp://rightsstatements.org/vocab/InC/1.0/
dc.subjectLink Mining
dc.subjectGraph Theory
dc.subjectMachine Learning
dc.subjectEvolutionary Computation
dc.subject.umiArtificial Intelligence (0800)
dc.subject.umiComputer Science (0984)
dc.titleLink discovery in very large graphs by constructive induction using genetic programming
dc.typeThesis

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
TimothyWeninger2008.pdf
Size:
740.55 KB
Format:
Adobe Portable Document Format

License bundle

Now showing 1 - 1 of 1
No Thumbnail Available
Name:
license.txt
Size:
1.69 KB
Format:
Item-specific license agreed upon to submission
Description: