Graph mining for role extraction in predictive analytics of high-performance computing systems

Date

2020-05-01

Abstract

This thesis addresses the task of analyzing property graphs in system log data from high-performance computing (HPC) systems to identify entity roles that aid in predicting job submission outcomes. This predictive analytics project uses inductive learning on historical logs to produce regression models for estimating resource needs and potential shortfalls, and classification models that predict when jobs will fail due to insufficient resource allocation. The log files are generated by the workload manager of an HPC compute cluster and include runtime parameters for every submitted job. The research objectives of the overall project are to use these techniques to solve three extant problems: (1) predicting the sufficiency of resources requested in an HPC system at job submission time; (2) making HPC resource allocation more efficient; and (3) building a decision support system for HPC users. Previous approaches used features such as user demographics, and simulations combined with simple optimization algorithms, to improve resource allocation on a large-scale compute cluster (Kansas State University’s Beocat). In this thesis, role extraction is applied with the goal of creating a user-specific feature for machine learning tasks. Specific use cases include personalized prediction of submitted job outcomes and reinforcement learning from simulation for optimization tasks in job scheduling. Objectives include improving on the accuracy, precision, recall, and utility of previous learning systems.

Keywords

Graph mining, Role extraction, High-performance computing systems, Property graphs, Predictive analytics

Graduation Month

May

Degree

Master of Science

Department

Department of Computer Science

Major Professor

William H. Hsu

Date

2020

Type

Thesis
