Is Seeing Believing? A Practitioner’s Perspective on High-Dimensional Statistical Inference in Cancer Genomics Studies

dc.citation.doi: 10.3390/e26090794
dc.citation.issn: 1099-4300
dc.citation.issue: 9
dc.citation.jtitle: Entropy
dc.citation.volume: 26
dc.contributor.author: Fan, Kun
dc.contributor.author: Subedi, Srijana
dc.contributor.author: Yang, Gongshun
dc.contributor.author: Lu, Xi
dc.contributor.author: Ren, Jie
dc.contributor.author: Wu, Cen
dc.date.accessioned: 2024-09-20T21:14:15Z
dc.date.available: 2024-09-20T21:14:15Z
dc.date.issued: 2024-09-16
dc.date.published: 2024
dc.description.abstract: Variable selection methods have been extensively developed for and applied to cancer genomics data to identify important omics features associated with complex disease traits, including cancer outcomes. However, the reliability and reproducibility of the findings are in question if valid inferential procedures are not available to quantify the uncertainty of the findings. In this article, we provide a gentle but systematic review of high-dimensional frequentist and Bayesian inferential tools under sparse models which can yield uncertainty quantification measures, including confidence (or Bayesian credible) intervals, p values and false discovery rates (FDR). Connections in high-dimensional inferences between the two realms have been fully exploited under the “unpenalized loss function + penalty term” formulation for regularization methods and the “likelihood function × shrinkage prior” framework for regularized Bayesian analysis. In particular, we advocate for robust Bayesian variable selection in cancer genomics studies due to its ability to accommodate disease heterogeneity in the form of heavy-tailed errors and structured sparsity while providing valid statistical inference. The numerical results show that robust Bayesian analysis incorporating exact sparsity has yielded not only superior estimation and identification results but also valid Bayesian credible intervals under nominal coverage probabilities compared with alternative methods, especially in the presence of heavy-tailed model errors and outliers.
dc.identifier.uri: https://hdl.handle.net/2097/44633
dc.relation.uri: https://doi.org/10.3390/e26090794
dc.rights: CC BY 4.0
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Exact sparsity
dc.subject: Frequentist and Bayesian variable selection
dc.subject: Regularized variable selection
dc.subject: Robust Bayesian inference
dc.subject: Uncertainty quantification
dc.title: Is Seeing Believing? A Practitioner’s Perspective on High-Dimensional Statistical Inference in Cancer Genomics Studies
dc.type: Text

Files

Original bundle

Name: entropy_26_00794.pdf
Size: 4.25 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.6 KB
Format: Item-specific license agreed upon to submission