Machine Learning for High Performance Computing Applications

dc.contributor.author: Hutchison, Scott
dc.date.accessioned: 2024-04-15T19:01:52Z
dc.date.available: 2024-04-15T19:01:52Z
dc.date.graduationmonth: May
dc.date.published: 2024
dc.description.abstract: The focus of this study was to apply state-of-the-art Machine Learning (ML) techniques to problems in the High Performance Computing (HPC) domain. The ML techniques included clustering, various types of regression, a recommender system, and reinforcement learning using proximal policy optimization. Three applications of these techniques are presented. The first application used K-means clustering and Gradient Boosted Tree Regression (GBTR) to predict estimated queue time for jobs submitted to an HPC system. This method achieved 96% accuracy when predicting whether or not a job would start prior to a specified deadline. The second application focused on optimizing hardware procurement for HPC systems while remaining under a fixed budget. Vendor quotes for new hardware were used with a custom Discrete Event Simulator (DES) to simulate the execution of a job workload on proposed hardware. An Extreme Gradient Boosting (XGBoost) regression model powers a recommender system that provides a precision@50 of 92%. The third application used Proximal Policy Optimization (PPO) with Invalid Action Masking (IAM) to train a Reinforcement Learning (RL) agent to schedule jobs on a simulated HPC system. The performance of this RL agent was compared to modern scheduling algorithms. The RL agent performed 18.44% better than the algorithmic baselines for one metric and comparably to the baselines for another.
dc.description.advisor: Daniel A. Andresen
dc.description.degree: Doctor of Philosophy
dc.description.department: Department of Computer Science
dc.description.level: Doctoral
dc.identifier.uri: https://hdl.handle.net/2097/44307
dc.language.iso: en_US
dc.subject: High Performance Computing
dc.subject: Machine Learning
dc.subject: Reinforcement Learning
dc.subject: Regression
dc.subject: Artificial Intelligence
dc.title: Machine Learning for High Performance Computing Applications
dc.type: Dissertation

Files

Original bundle
Name: ScottHutchison2024.pdf
Size: 1.47 MB
Format: Adobe Portable Document Format