Comparison of blocking and hierarchical ways to find cluster

Kumar, Swapnil

Files

SwapnilKumar2017.pdf (2.11 MB)

Date

2017-05-01

Authors

Kumar, Swapnil

Publisher

Kansas State University

Abstract

Clustering in data mining is the process of discovering groups in a set of data such that similarity within each group is maximized and similarity between groups is minimized.

One way of approaching clustering is to treat it as a blocking problem: minimizing the maximum distance between any two units within the same group. This method is known as threshold blocking. It works by formulating blocking as a graph partitioning problem.
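The objective described above can be illustrated with a minimal Python sketch. It only evaluates the quantity being minimized (the largest distance between two units placed in the same group) for a given assignment; it does not search for the best partition and is not the algorithm of Higgins et al. The function name `max_within_group_distance`, the use of Euclidean distance, and the sample points are assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def max_within_group_distance(points, labels):
    """Return the largest Euclidean distance between two units that share a label."""
    # Collect the points belonging to each group.
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)

    # Track the worst (largest) pairwise distance found inside any group.
    worst = 0.0
    for members in groups.values():
        for a, b in combinations(members, 2):
            worst = max(worst, dist(a, b))
    return worst


# Hypothetical data: two tight pairs of points that lie far apart.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 7.0)]
tight = max_within_group_distance(points, [0, 0, 1, 1])  # nearby units grouped together
loose = max_within_group_distance(points, [0, 1, 0, 1])  # distant units grouped together
```

A grouping that keeps nearby units together yields a smaller value than one that mixes distant units, which is the sense in which a blocking is "better" under this criterion.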

Chameleon is a hierarchical clustering algorithm that measures the similarity between two clusters using a dynamic model. During clustering, two clusters are merged only if the inter-connectivity and closeness between them are high relative to the internal inter-connectivity of the clusters and the closeness of items within them. Merging clusters with this dynamic model helps discover natural and homogeneous clusters.
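The merge criterion can be sketched in Python as follows. The formulas follow the definitions of relative inter-connectivity and relative closeness in the original Chameleon paper by Karypis, Han, and Kumar; the implementation in this report may differ. All inputs (edge-cut weights and average edge weights) are assumed to be precomputed from a nearest-neighbor graph, and the function names, the sample values, and the default `alpha` are illustrative assumptions.

```python
def relative_interconnectivity(ec_between, ec_internal_i, ec_internal_j):
    """Edge-cut weight between two clusters, normalized by the mean internal edge cut."""
    return ec_between / ((ec_internal_i + ec_internal_j) / 2.0)


def relative_closeness(avg_between, avg_internal_i, avg_internal_j, size_i, size_j):
    """Average weight of connecting edges, normalized by the size-weighted
    average weight of each cluster's internal bisection edges."""
    total = size_i + size_j
    weighted = (size_i / total) * avg_internal_i + (size_j / total) * avg_internal_j
    return avg_between / weighted


def merge_score(ri, rc, alpha=2.0):
    """Combine both measures; alpha is a user-chosen weight on closeness (assumed value)."""
    return ri * rc ** alpha


# Hypothetical precomputed quantities for two candidate clusters.
ri = relative_interconnectivity(ec_between=6.0, ec_internal_i=4.0, ec_internal_j=8.0)
rc = relative_closeness(avg_between=0.5, avg_internal_i=0.4, avg_internal_j=0.6,
                        size_i=10, size_j=10)
score = merge_score(ri, rc)
```

Pairs of clusters with a higher score are better merge candidates, since the connection between them resembles the connections inside each cluster.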

The main goal of this project is to build a local implementation of Chameleon and compare its output against the threshold blocking algorithm suggested by Higgins et al., in both its hybridized and unhybridized forms.

Keywords

Clustering, Hierarchical, Threshold blocking

Graduation Month

May

Degree

Master of Science

Department

Department of Computing and Information Sciences

Major Professor

William H. Hsu

Type

Report

URI

http://hdl.handle.net/2097/35425

Collections

K-State Electronic Theses, Dissertations, and Reports: 2004 -
