CMStatistics 2017
Title: Controlling FWER in Stepwise Regression Using Multiple Comparisons
Authors: Kory Johnson - University of Vienna (Austria) [presenting]
Abstract: Forward stepwise regression provides an approximation to the sparse feature selection problem and is used when the number of features is too large to search the model space manually. In this setting, we desire a rule for stopping stepwise regression using hypothesis tests while controlling a notion of false rejections. However, forward stepwise regression is commonly considered to be "data dredging" and not statistically sound: because the hypotheses tested by forward stepwise are chosen by looking at the data, the resulting classical hypothesis tests are not valid. We present a simple solution that leverages classical multiple comparison methods to test the stepwise hypotheses using the max-t test proposed in \cite{BujaB14}. The resulting procedures are fast enough to be used in high-dimensional settings while controlling the familywise error rate (FWER). Other procedures estimate new p-values and perform selection while controlling the false discovery rate (FDR). While our error measure is more conservative, we achieve significantly higher power with massive gains in speed and simplicity.
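The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea: forward stepwise selection that stops when the largest absolute t-statistic among the remaining candidates fails a multiplicity-adjusted test. Everything specific here is an assumption made for illustration and not the author's method: the function name `forward_stepwise_max_t`, the Bonferroni adjustment over the remaining candidates, and the normal approximation for the critical value. The sketch makes no claim to reproduce the FWER guarantee described above.

```python
import numpy as np
from statistics import NormalDist


def forward_stepwise_max_t(X, y, alpha=0.05):
    """Hypothetical forward stepwise selection with a max-|t| stopping rule.

    Assumption: at each step the largest |t| among the remaining candidates
    is compared with a Bonferroni-adjusted normal critical value.
    """
    n, p = X.shape
    selected = []
    basis = np.ones((n, 1))  # start from the intercept-only model
    while len(selected) < p:
        # Residual degrees of freedom once one more feature is added.
        df = n - basis.shape[1] - 1
        if df <= 0:
            break
        remaining = [j for j in range(p) if j not in selected]

        # Residualize the response and every candidate on the current model.
        r = y - basis @ np.linalg.lstsq(basis, y, rcond=None)[0]
        C = X[:, remaining]
        Z = C - basis @ np.linalg.lstsq(basis, C, rcond=None)[0]

        # t-statistic for adding each candidate; collinear candidates get 0.
        norms = np.linalg.norm(Z, axis=0)
        ok = norms > 1e-10
        t = np.zeros(len(remaining))
        proj = (Z[:, ok].T @ r) / norms[ok]
        rss = np.maximum(r @ r - proj**2, 1e-300)
        t[ok] = proj / np.sqrt(rss / df)

        # Two-sided Bonferroni cutoff over the remaining candidates.
        best = int(np.argmax(np.abs(t)))
        cutoff = NormalDist().inv_cdf(1 - alpha / (2 * len(remaining)))
        if abs(t[best]) < cutoff:
            break  # stop: the max-|t| test does not reject
        selected.append(remaining[best])
        basis = np.column_stack([basis, X[:, remaining[best]]])
    return selected
```

Calling `forward_stepwise_max_t(X, y)` on an `n` by `p` design matrix returns the column indices in the order they were added. Each step costs a couple of least-squares solves, which is the kind of cheap computation that makes a max-t style rule attractive when `p` is large.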