Title: Breaking the lasso power-FDR tradeoff diagram by thresholding
Authors: Asaf Weinstein - Stanford University (United States) [presenting]
Weijie Su - The Wharton School, University of Pennsylvania (United States)
Malgorzata Bogdan - University of Wroclaw (Poland)
Emmanuel Candes - Stanford (United States)
Abstract: The lasso is often used by practitioners as a variable selector in large regression problems. Most commonly, the penalty parameter is chosen by cross-validation, even though it is well known that this method tends to yield too many false discoveries. Furthermore, recent work has shown that even if $\lambda$ is set so that the false discovery rate is controlled, there is still an inherent cost in power (identification of the true nonnulls), adding to the criticism of using the lasso for support estimation. It is also well known among practitioners that this phenomenon can be mitigated by thresholding the lasso estimate (at a value larger than zero). Working with IID Gaussian covariates, we analyze and precisely quantify the advantages that such a procedure can have in terms of the tradeoff between the false discovery proportion and the true positive proportion. Importantly, the penalty parameter $\lambda$ now plays a crucial role in the ordering of the variables, and, interestingly, we explain why cross-validation is the right way to choose it (at least in the IID Gaussian covariates case).
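To make the setting concrete, here is a minimal, hypothetical sketch (not the authors' procedure or code) of thresholded-lasso selection on an IID Gaussian design. The lasso solver, the choice $\lambda = 0.1$, the signal strength, and the threshold level $0.5$ are all illustrative assumptions; the sketch only shows how raising the threshold above zero trades discoveries for a lower false discovery proportion (FDP) while retaining the true positive proportion (TPP).

```python
# Hypothetical illustration of thresholded-lasso selection; all numbers
# (lam, threshold, signal strength) are assumptions, not from the talk.
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso via cyclic coordinate descent:
    minimize (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # per-column (1/n)||X_j||^2
    r = y.copy()                        # residual y - Xb
    for _ in range(n_iter):
        for j in range(p):
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r += X[:, j] * (b[j] - new)  # keep residual in sync
            b[j] = new
    return b

rng = np.random.default_rng(0)
n, p, k = 300, 100, 10
X = rng.standard_normal((n, p))         # IID Gaussian covariates
beta = np.zeros(p)
beta[:k] = 4.0                          # true nonnulls (assumed strong)
y = X @ beta + rng.standard_normal(n)

b = lasso_cd(X, y, lam=0.1)             # small lam: many small false entries
for thr in (0.0, 0.5):                  # select at zero vs. a positive threshold
    sel = np.abs(b) > thr
    fdp = sel[k:].sum() / max(sel.sum(), 1)  # false discovery proportion
    tpp = sel[:k].sum() / k                  # true positive proportion
    print(f"threshold={thr}: discoveries={sel.sum()}, "
          f"FDP={fdp:.2f}, TPP={tpp:.2f}")
```

In this toy run, the positive threshold prunes the small spurious lasso coefficients while the strong true signals, only mildly shrunk, survive, so FDP drops without sacrificing TPP.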