Robust variable selection with stability selection

dc.contributor.author: Yang, Gongshun
dc.date.accessioned: 2024-04-15T19:07:09Z
dc.date.available: 2024-04-15T19:07:09Z
dc.date.graduationmonth: May
dc.date.issued: 2024
dc.description.abstract: Heterogeneity is the hallmark of cancer. With high-dimensional cancer genomics features, robust variable selection is critical for accommodating the heavy-tailed distributions and outliers that arise from cancer heterogeneity when selecting important omics features associated with the disease phenotype. However, given the data heterogeneity, it is challenging to choose tuning parameters that maximize the performance of regularization approaches. In published studies, assisted tuning has been proposed as an effective strategy for regularized variable selection, overcoming the obstacle of selecting optimal tuning parameters for non-robust variable selection methods such as LASSO. Nevertheless, we show in this study that the computationally heavy nature of this strategy makes it infeasible for omics feature selection when data heterogeneity is present. This limitation has motivated us to develop robust variable selection with stability selection, which significantly outperforms a myriad of variable selection methods. In extensive simulation studies, we demonstrate the advantage of the proposed method over alternative methods in terms of multiple criteria that are widely used in the literature to assess the performance of machine learning methods. Furthermore, all the methods under comparison have been applied to the breast cancer and skin cancer data from TCGA, where the proposed method also shows superior performance.
dc.description.advisor: Cen Wu
dc.description.degree: Master of Science
dc.description.department: Department of Statistics
dc.description.level: Masters
dc.identifier.uri: https://hdl.handle.net/2097/44309
dc.publisher: Kansas State University
dc.rights: © the author. This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Robust
dc.subject: High-dimensional data
dc.subject: Variable selection
dc.title: Robust variable selection with stability selection
dc.type: Report

Files

Original bundle

Name: GongshunYang2024.pdf
Size: 632.3 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.6 KB
Format: Item-specific license agreed upon to submission