Bayesian model selection consistency for high-dimensional regression

dc.contributor.author: Hua, Min
dc.date.accessioned: 2022-06-08T20:15:08Z
dc.date.available: 2022-06-08T20:15:08Z
dc.date.graduationmonth: August
dc.date.published: 2022
dc.description.abstract: Bayesian model selection has enjoyed considerable prominence in high-dimensional variable selection in recent years. Despite its popularity, the asymptotic theory for high-dimensional variable selection has not yet been fully explored. In this study, we aim to identify prior conditions for Bayesian model selection consistency under high-dimensional regression settings. In a Bayesian framework, posterior model probabilities can be used to quantify the importance of models given the observed data. Hence, our focus is on the asymptotic behavior of posterior model probabilities when the number of potential predictors grows with the sample size. This dissertation contains the following three projects.

In the first project, we investigate the asymptotic behavior of posterior model probabilities under Zellner's g-prior, which is one of the most popular choices for model selection in Bayesian linear regression. We establish a simple and intuitive condition on Zellner's g-prior under which the posterior model distribution concentrates at the true model as the sample size increases, even if the number of predictors grows much faster than the sample size does. Simulation study results indicate that satisfying our condition is essential for the success of Bayesian high-dimensional variable selection under the g-prior.

In the second project, we extend our framework to a general class of priors. The most pressing challenge in this generalization is that the marginal likelihood cannot be expressed in closed form. To address this problem, we develop a general form of Laplace approximation under a high-dimensional setting. As a result, we establish general sufficient conditions for high-dimensional Bayesian model selection consistency. Our simulation study and real data analysis demonstrate that the proposed condition allows us to identify the true data-generating model consistently.

In the last project, we extend our framework to Bayesian generalized linear regression models. The distinctive feature of our proposed framework is that we do not impose any specific form of data distribution. In this project, we develop a general condition under which the true model tends to maximize the marginal likelihood even when the number of predictors increases faster than the sample size. Our condition provides useful guidelines for the specification of priors, including hyperparameter selection. Our simulation study demonstrates the validity of the proposed condition for Bayesian model selection consistency with non-Gaussian data.
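To illustrate the posterior model probabilities discussed in the abstract, the sketch below enumerates candidate models in a toy linear regression and scores each one with the well-known closed-form Bayes factor under Zellner's g-prior (as given in Liang et al., 2008). This is not the dissertation's own procedure or condition; the uniform model prior, the choice g = n, and the simulated data are illustrative assumptions.

```python
import itertools
import numpy as np

def log_bayes_factor(y, X, subset, g):
    """Log Bayes factor of the model using predictor columns `subset`
    versus the null (intercept-only) model under Zellner's g-prior,
    using the closed form BF = (1+g)^((n-p-1)/2) / (1 + g(1-R^2))^((n-1)/2)."""
    n = len(y)
    yc = y - y.mean()
    if not subset:
        return 0.0  # null model vs. itself
    Xg = X[:, subset] - X[:, subset].mean(axis=0)  # center out the intercept
    p = len(subset)
    beta = np.linalg.lstsq(Xg, yc, rcond=None)[0]
    r2 = 1.0 - ((yc - Xg @ beta) ** 2).sum() / (yc @ yc)  # R^2 of the model
    return 0.5 * (n - p - 1) * np.log1p(g) - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2))

def posterior_model_probs(y, X, g):
    """Enumerate all 2^d subsets of predictors and return their posterior
    probabilities under a uniform prior over models."""
    d = X.shape[1]
    models = [s for r in range(d + 1) for s in itertools.combinations(range(d), r)]
    log_bf = np.array([log_bayes_factor(y, X, list(m), g) for m in models])
    w = np.exp(log_bf - log_bf.max())  # subtract max for numerical stability
    return dict(zip(models, w / w.sum()))

# Toy illustration: y depends only on the first predictor.
rng = np.random.default_rng(0)
n, d = 100, 4
X = rng.standard_normal((n, d))
y = 2.0 * X[:, 0] + rng.standard_normal(n)
probs = posterior_model_probs(y, X, g=n)  # unit-information choice g = n
best = max(probs, key=probs.get)          # posterior modal model
```

With a strong signal on the first predictor, the posterior mass concentrates on the true model `(0,)`; how fast this happens as d grows with n is exactly the consistency question the dissertation studies.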
dc.description.advisor: Gyuhyeong Goh
dc.description.degree: Doctor of Philosophy
dc.description.department: Department of Statistics
dc.description.level: Doctoral
dc.identifier.uri: https://hdl.handle.net/2097/42254
dc.language.iso: en_US
dc.subject: Bayesian model selection
dc.subject: High-dimensional regression
dc.subject: Posterior model probability consistency
dc.title: Bayesian model selection consistency for high-dimensional regression
dc.type: Dissertation

Files

Original bundle (1 file)
Name: MinHua2022.pdf
Size: 864.3 KB
Format: Adobe Portable Document Format

License bundle (1 file)
Name: license.txt
Size: 1.62 KB
Format: Item-specific license agreed to upon submission