Density and partition based clustering on massive threshold bounded data sets

K-REx Repository


dc.contributor.author Kannamareddy, Aruna Sai
dc.date.accessioned 2017-04-21T14:01:26Z
dc.date.available 2017-04-21T14:01:26Z
dc.date.issued 2017-05-01 en_US
dc.identifier.uri http://hdl.handle.net/2097/35467
dc.description.abstract This project explores the possibility of increasing the efficiency of clustering massive data sets by building clusters from units formed with the threshold blocking algorithm. Clusters formed this way are denser and of higher quality. Clusters produced by individual clustering algorithms alone do not necessarily eliminate outliers, and they can be complex or improperly distributed over the data set. The threshold blocking algorithm, from a current research paper by Michael Higgins of the Department of Statistics, by contrast performs better than existing algorithms at forming dense, distinctive units subject to a predefined threshold. Part of this project is to develop a hybridized algorithm that applies existing clustering algorithms to re-cluster the units thus formed. Clustering on the seeds produced by the threshold blocking algorithm eases the task of the existing algorithm by removing the overhead of handling outliers, and the resulting clusters are more representative of the whole data set. Because the threshold blocking algorithm is proven to be fast and efficient, more predictions can be made from large data sets in less time. The hybridized algorithm is evaluated on the task of predicting similar songs from the Million Song Dataset. en_US
dc.language.iso en_US en_US
dc.publisher Kansas State University en
dc.subject Threshold blocking en_US
dc.subject Clustering en_US
dc.subject K-means en_US
dc.subject DBSCAN en_US
dc.subject Hybrid cluster model en_US
dc.title Density and partition based clustering on massive threshold bounded data sets en_US
dc.type Report en_US
dc.description.degree Master of Science en_US
dc.description.level Masters en_US
dc.description.department Department of Computing and Information Sciences en_US
dc.description.advisor William H. Hsu en_US
dc.date.published 2017 en_US
dc.date.graduationmonth May en_US
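
The two-stage idea described in the abstract (form threshold-bounded units first, then re-cluster those units with an existing algorithm) can be illustrated with a minimal sketch. This is not the report's implementation and not the published threshold blocking algorithm: `greedy_blocks` is a hypothetical, simplified stand-in that only guarantees each unit holds at least `threshold` points, and the small `kmeans` helper stands in for whichever existing clustering algorithm is used in the second stage. All function and parameter names are assumptions made for illustration.

```python
import math
import random


def centroid(points):
    """Mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def greedy_blocks(points, threshold):
    """Hypothetical stand-in for threshold blocking (NOT the published method).

    Repeatedly takes a seed point and its nearest unassigned neighbours so
    that every block contains at least `threshold` points.
    """
    unassigned = list(points)
    blocks = []
    while len(unassigned) >= threshold:
        seed = unassigned.pop(0)
        unassigned.sort(key=lambda p: math.dist(seed, p))
        blocks.append([seed] + unassigned[:threshold - 1])
        unassigned = unassigned[threshold - 1:]
    if not blocks:
        # Fewer points than the threshold: keep them as a single unit.
        return [unassigned] if unassigned else []
    # Leftover points join the block whose centroid is nearest.
    for p in unassigned:
        min(blocks, key=lambda b: math.dist(centroid(b), p)).append(p)
    return blocks


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd-style k-means; returns one label per input point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(centers[i], p))
            groups[nearest].append(p)
        # Keep the old center when a group ends up empty.
        centers = [centroid(g) if g else centers[i] for i, g in enumerate(groups)]
    return [min(range(k), key=lambda i: math.dist(centers[i], p)) for p in points]


def hybrid_cluster(points, threshold, k):
    """Stage 1: form blocks. Stage 2: k-means on the block centroids.

    Returns a dict mapping each point (a tuple) to its cluster label.
    Duplicate points would share one dict entry in this sketch.
    """
    blocks = greedy_blocks(points, threshold)
    seeds = [centroid(b) for b in blocks]
    labels = kmeans(seeds, k)
    return {p: label for block, label in zip(blocks, labels) for p in block}


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
    print(hybrid_cluster(data, threshold=2, k=2))
```

Because the second stage sees only one centroid per block, it works on far fewer items than the raw data set, which is the efficiency argument the abstract makes. A density-based method could be substituted for `kmeans` in the second stage without changing the overall structure.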

