CMStatistics 2017
Title: Sorted-L1 norm for outliers detection and high-dimensional robust regression: Sharp oracle inequalities and FDR control
Authors: Stephane Gaiffas - Universite Paris-Diderot (France)
Alain Virouleau - Ecole polytechnique (France) [presenting]
Agathe Guilloux - Universite Evry (France)
Malgorzata Bogdan - University of Wroclaw (Poland)
Abstract: Outlier detection and robust regression in a high-dimensional setting are fundamental problems in statistics, with numerous applications. Following a recent line of work on methods for simultaneous robust regression and outlier detection, we consider a linear regression model with individual intercepts in a high-dimensional setting. Each individual intercept, if non-zero, corresponds to an individual shift from the linear model. In this setting, we introduce a new procedure for simultaneous estimation of the linear regression coefficients and the intercepts, using two dedicated sorted-L1 penalizations, also called SLOPE. We develop a complete theory for this problem. First, we provide sharp oracle inequalities on the statistical estimation error of both the vector of individual intercepts and the regression coefficients. Second, we give asymptotic control of the False Discovery Rate (FDR) and of the statistical power for support selection of the intercepts, that is, for the outlier detection problem, as obtained through our method. Notably, this is the first attempt to introduce a procedure for this problem with guaranteed FDR and statistical power control. Numerical illustrations, with a comparison to recent alternative approaches, are provided on both simulated datasets and several real-world datasets.
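As a rough illustration of the ingredients named in the abstract, the sketch below computes a sorted-L1 norm and evaluates a penalized least-squares objective for a linear model with individual intercepts. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`sorted_l1_norm`, `penalized_objective`), the 1/2 scaling of the squared loss, and the way the two penalties are combined are illustrative choices, and the abstract does not specify the weight sequences, so they are left as inputs.

```python
import numpy as np


def sorted_l1_norm(v, weights):
    """Sorted-L1 norm: sum_i weights[i] * |v|_(i), where |v|_(1) >= |v|_(2) >= ...

    `weights` must be non-negative and non-increasing, so the largest
    entry of |v| is paired with the largest weight.
    """
    v = np.asarray(v, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if v.shape != weights.shape:
        raise ValueError("v and weights must have the same shape")
    if np.any(weights < 0) or np.any(np.diff(weights) > 0):
        raise ValueError("weights must be non-negative and non-increasing")
    abs_sorted = np.sort(np.abs(v))[::-1]  # absolute values, largest first
    return float(np.dot(weights, abs_sorted))


def penalized_objective(y, X, beta, mu, w_beta, w_mu):
    """Hypothetical objective for the model y = X @ beta + mu + noise.

    `mu` holds one intercept per observation; a non-zero entry marks a
    shift of that observation away from the linear model (an outlier).
    Assumption: squared loss scaled by 1/2, plus one sorted-L1 penalty
    on beta and one on mu.
    """
    residual = y - X @ beta - mu
    loss = 0.5 * float(residual @ residual)
    return loss + sorted_l1_norm(beta, w_beta) + sorted_l1_norm(mu, w_mu)
```

With constant weights the sorted-L1 norm reduces to an ordinary (scaled) L1 norm, which is one way to see it as a generalization of the Lasso penalty.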