Title: Weeding out early false discoveries along the lasso path via knockoffs
Authors: Malgorzata Bogdan - University of Wroclaw (Poland) [presenting]
Weijie Su - University of Pennsylvania (United States)
Asaf Weinstein - Stanford University (United States)
Emmanuel Candes - Stanford University (United States)
Abstract: LASSO is one of the most popular methods for identifying predictors in large databases, even though in practical applications it often returns many false discoveries (i.e. variables which in fact are not correlated with the response). Recently, this phenomenon has been described theoretically within the linear sparsity regime of approximate message passing (AMP) theory for Gaussian designs. Specifically, a precise trade-off between power and the false discovery rate (FDR) has been derived; it holds regardless of the signal magnitude and cannot be broken by any choice of the tuning parameter. We will show that this limitation can be removed by thresholding the LASSO solution. An appropriate threshold can be obtained using the knockoff methodology, which allows us to control the FDR. We will present theoretical results showing that this approach allows us to break through the FDR-power diagram for any given value of the tuning parameter. We will also demonstrate empirically that selecting the tuning parameter by cross-validation yields roughly optimal power for a given FDR level.
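The thresholding idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that are not stated in the abstract: the design has independent Gaussian columns (so an independent copy of the design is a valid knockoff matrix), the statistic is the LASSO coefficient difference between each variable and its knockoff, and the cutoff is the standard knockoff+ threshold. The function names `lasso_cd`, `knockoff_threshold` and `run_demo`, the default parameter values, and the fixed tuning parameter `lam` (used here instead of cross-validation) are all hypothetical.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for 0.5 * ||y - X b||^2 + lam * ||b||_1."""
    b = np.zeros(X.shape[1])
    col_sq = (X ** 2).sum(axis=0)
    r = y.astype(float)  # running residual y - X b (astype returns a copy)
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            # Correlation of column j with the partial residual.
            rho = X[:, j] @ r + col_sq[j] * b[j]
            # Soft-thresholding update for coordinate j.
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * (new - b[j])
            b[j] = new
    return b


def knockoff_threshold(W, q):
    """Knockoff+ cutoff: smallest t with (1 + #{W <= -t}) / max(1, #{W >= t}) <= q."""
    for t in np.sort(np.abs(W[W != 0])):
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return t
    return np.inf  # no cutoff reaches the target level: select nothing


def run_demo(n=300, p=100, k=20, lam=1.0, q=0.2, seed=0):
    rng = np.random.default_rng(seed)
    # Assumption: i.i.d. N(0, 1/n) design, so columns have roughly unit norm.
    X = rng.normal(scale=1 / np.sqrt(n), size=(n, p))
    # Assumption: with independent columns, a fresh independent copy of the
    # design serves as the knockoff matrix.
    X_ko = rng.normal(scale=1 / np.sqrt(n), size=(n, p))
    beta = np.zeros(p)
    beta[:k] = 6.0  # the first k variables carry signal
    y = X @ beta + rng.normal(size=n)
    # LASSO on the augmented design [X, X_ko] at a fixed tuning parameter.
    b = lasso_cd(np.hstack([X, X_ko]), y, lam)
    # Coefficient-difference statistic: large positive W_j favours variable j.
    W = np.abs(b[:p]) - np.abs(b[p:])
    T = knockoff_threshold(W, q)
    selected = np.flatnonzero(W >= T)
    return selected, T, W


selected, T, W = run_demo()
print("threshold:", T, "number selected:", selected.size)
```

In this sketch the threshold is applied to the coefficient differences at one fixed value of `lam`, rather than by reading selections off the LASSO path; replacing the fixed `lam` with a cross-validated value would be the natural next step suggested by the abstract.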