Statistical mechanics approaches to high-dimensional survival analysis

Date

2022-05-01

Abstract

With the advent of high-dimensional data, variable selection has become a key step in survival data analysis. Recently, a general class of model selection criteria for high-dimensional data, called the generalized information criterion, has been developed. However, the use of non-convex penalty functions in the generalized information criterion leads to high-dimensional non-convex optimization problems. Although many methods have been proposed, they largely rely on convex surrogate approaches, which cannot guarantee convergence to the globally optimal model with respect to the generalized information criterion. The objective of this dissertation is to develop new solutions to the challenges that high-dimensional data pose for survival analysis. To meet this goal, we develop a powerful framework for high-dimensional survival data analysis based on statistical mechanics, one of the pillars of modern physics. The proposed methods are widely applicable not only to model fitting problems but also to prediction problems. We evaluate their performance through extensive simulation studies and real data analyses. In Chapter 1, we discuss the background, existing obstacles, rationale, and motivation. In Chapter 2, we develop a new fast variable selection procedure based on simulated annealing with some modifications. The proposed method allows for rapidly finding the globally optimal model with respect to the generalized information criterion. In Chapter 3, we develop a new best predictive model selection method for high-dimensional survival modeling. The proposed method relies on the idea of the optimal Bayesian predictive model, called the median probability model. In Chapter 4, we develop a robust variable selection approach for high-dimensional survival regression models. It is motivated by the "sandwich" estimator and provides a way to find the globally optimal model when the model is misspecified.
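
As a rough illustration of the general idea behind annealing-based variable selection, the sketch below runs textbook simulated annealing over subsets of candidate variables, accepting worse models with a Boltzmann-type probability that shrinks as the temperature falls. It is not the modified procedure developed in the dissertation. The names `anneal_select` and `toy_criterion`, the cooling schedule, and the toy scoring function are all assumptions made for illustration; in practice the criterion would be a generalized information criterion computed from a fitted survival model, which is not shown here.

```python
import math
import random


def anneal_select(criterion, p, n_iter=2000, t0=1.0, cooling=0.995, seed=0):
    """Plain simulated annealing over subsets of {0, ..., p-1}.

    `criterion` maps a frozenset of variable indices to a score to minimize.
    Returns the best subset visited and its score.
    """
    rng = random.Random(seed)
    current = frozenset()
    cur_val = criterion(current)
    best, best_val = current, cur_val
    temp = t0
    for _ in range(n_iter):
        # Propose a neighbor by adding or dropping one randomly chosen variable.
        j = rng.randrange(p)
        proposal = current ^ frozenset([j])
        prop_val = criterion(proposal)
        delta = prop_val - cur_val
        # Always accept improvements; accept a worse model with
        # probability exp(-delta / temp), the Boltzmann acceptance rule.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, cur_val = proposal, prop_val
            if cur_val < best_val:
                best, best_val = current, cur_val
        temp *= cooling  # geometric cooling schedule (an assumption)
    return best, best_val


def toy_criterion(model, truth=frozenset({1, 4}), miss_cost=10, size_cost=2):
    """Hypothetical stand-in for an information criterion:
    a cost for each omitted true variable plus a penalty on model size."""
    return miss_cost * len(truth - model) + size_cost * len(model)
```

Calling `anneal_select(toy_criterion, p=8)` searches the 2^8 candidate subsets of eight variables. The toy criterion is separable and easy to minimize, so it only demonstrates the mechanics of the search, not the difficulty of a real non-convex selection problem.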

Keywords

Survival analysis, High-dimensional variable selection, Generalized information criterion, Statistical mechanics, Boltzmann distribution

Graduation Month

May

Degree

Doctor of Philosophy

Department

Department of Statistics

Major Professor

Gyuhyeong Goh

Date

2022

Type

Dissertation
