CMStatistics 2023
B1698
Title: Application of supersaturated design-based statistical methods on observational data for variable selection
Authors: Tharkeshi Dharmaratne - RMIT University (Australia) [presenting]
Alysha De Livera - La Trobe University (Australia)
Stelios Georgiou - RMIT University (Australia)
Stella Stylianou - RMIT University (Australia)
Abstract: In experimental studies, factor screening can be performed with statistical methods based on supersaturated screening designs (SSDs) when the number of factors exceeds the run size. Simulation studies have shown that these SSD methods perform well in some experimental settings. Many of them are test-based, penalty-based, or modifications of statistical methods originally introduced for observational data. This motivates a novel approach: exploring the use of contrast-based SSD methods for variable selection on observational data. Variable selection chooses the factors used for model building in observational studies and is widely performed using data-driven methods, which have often been criticised because of model uncertainty. As a remedy, when a data-driven method is applied to a real-life dataset, it is recommended to apply the method to resampled data and to assess model stability (the robustness of the selected model to slight changes in the dataset) using resampling-based measures. Accordingly, two contrast-based SSD screening methods were first modified and applied to a real-life dataset that is commonly used in methodological observational studies. The variable selection performance of these methods was then compared with that of existing variable selection methods for observational studies using resampling-based measures.
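The resampling-based stability assessment mentioned in the abstract can be illustrated with bootstrap inclusion frequencies, one common resampling-based measure. The following is a minimal sketch only: the selector `select_top_k` is a hypothetical stand-in that ranks variables by absolute marginal correlation (it is not the contrast-based SSD methods of the abstract), and the data, the function names, and the parameters `k` and `n_boot` are assumptions made for illustration.

```python
import numpy as np

def select_top_k(X, y, k):
    """Hypothetical stand-in selector (NOT an SSD method): return the
    indices of the k columns with the largest absolute correlation with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    corr = np.abs(Xc.T @ yc) / denom
    return np.argsort(corr)[-k:]

def bootstrap_inclusion_frequencies(X, y, k, n_boot=200, seed=0):
    """Apply the selector to n_boot bootstrap resamples and return, for
    each variable, the fraction of resamples in which it was selected."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        counts[select_top_k(X[idx], y[idx], k)] += 1
    return counts / n_boot

# Simulated data (assumed for illustration): only the first two of six
# variables truly affect the response.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=100)

freq = bootstrap_inclusion_frequencies(X, y, k=2)
print(freq)  # one inclusion frequency per variable
```

Variables with inclusion frequencies close to 1 are selected consistently across resamples, whereas intermediate frequencies indicate that the selected model is sensitive to small changes in the data.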