A hybrid approach to clustering optimization

dc.contributor.author: Hayes, Dustin
dc.date.accessioned: 2024-01-29T21:28:03Z
dc.date.available: 2024-01-29T21:28:03Z
dc.date.graduationmonth: August
dc.date.issued: 2024
dc.description.abstract: A known limitation of many clustering algorithms is their inability to guarantee convergence to a global optimum. The K-means clustering algorithm is a representative example: given data x and an objective function L to be optimized, it only ensures that cluster assignments are found such that L is locally optimal. Consequently, after a single run of K-means one cannot generally know whether the globally optimal solution or an inferior locally optimal one was found. This limitation is not exclusive to K-means; other algorithms, such as the EM algorithm for Gaussian mixture models, are also liable to converge to a sub-optimal solution. In practice, one can partly mitigate this limitation by running the chosen clustering algorithm several times and selecting the best solution found. However, this approach is uncertain by nature: at no point does the practitioner know whether the optimal solution has been found or whether more attempts are required. A single algorithm that offers a higher degree of assurance that the global optimum was reached would be preferable. In this paper we explore existing clustering methods, discuss their limitations, and present a novel clustering algorithm, with an implementation in R, that achieves a higher assurance of global optimization than traditional clustering methods. By strategically passing the optimization problem between an EM algorithm and a Gibbs sampler, our algorithm takes advantage of both local optimization and global search methods to explore the complex loss landscape of the objective function, ensuring more reliable convergence to optimal solutions.
dc.description.advisor: Gyuhyeong Goh
dc.description.degree: Master of Science
dc.description.department: Department of Statistics
dc.description.level: Masters
dc.identifier.uri: https://hdl.handle.net/2097/44123
dc.language.iso: en_US
dc.publisher: Kansas State University
dc.rights: © the author. This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Clustering
dc.subject: Optimization
dc.subject: Bayesian
dc.subject: Machine learning
dc.title: A hybrid approach to clustering optimization
dc.type: Report
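
The multi-start practice mentioned in the abstract (run the clustering algorithm several times and keep the best solution) can be illustrated with a minimal sketch. This is not the report's hybrid EM/Gibbs algorithm, and it is written in Python rather than the R used in the report. The function names (`kmeans`, `multi_start_kmeans`), the synthetic data, and the choice of 20 starts are assumptions made purely for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(coord) / n for coord in zip(*cluster))

def kmeans(points, k, rng, max_iter=100):
    """One run of Lloyd's K-means from a random start; returns (loss, centers)."""
    centers = rng.sample(points, k)  # start from k distinct data points
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[j].append(p)
        # Update step: move each center to its cluster mean (keep it if empty).
        new_centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:  # no change: a local optimum was reached
            break
        centers = new_centers
    # Objective L: total squared distance from each point to its nearest center.
    loss = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return loss, centers

def multi_start_kmeans(points, k, n_starts=20, seed=0):
    """Run K-means n_starts times; return the best run and all sorted losses."""
    rng = random.Random(seed)
    runs = [kmeans(points, k, rng) for _ in range(n_starts)]
    best = min(runs, key=lambda r: r[0])
    return best, sorted(r[0] for r in runs)

# Hypothetical synthetic data: three Gaussian blobs in two dimensions.
data_rng = random.Random(1)
points = [(data_rng.gauss(cx, 0.5), data_rng.gauss(cy, 0.5))
          for cx, cy in [(0, 0), (5, 0), (0, 5)] for _ in range(30)]

(best_loss, best_centers), losses = multi_start_kmeans(points, k=3)
print(f"best loss over {len(losses)} starts: {best_loss:.2f}")
print("distinct local optima found:", sorted({round(l, 2) for l in losses}))
```

Each run stops at some local optimum of L, and different starts may stop at different ones. Taking the minimum over runs only reports the best solution seen so far; nothing in the procedure certifies that it is the global optimum, which is the uncertainty the abstract describes.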

Files

Original bundle

Name: DustinHayes2024.pdf
Size: 529.25 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.6 KB
Format: Item-specific license agreed upon to submission