Title: Cellwise robust M estimation based on sparse outlyingness
Authors: Sven Serneels - BASF Corporation (United States) [presenting]
Tim Verdonck - UAntwerp, KU Leuven (Belgium)
Sebastiaan Hoppner - KU Leuven (Belgium)
Abstract: Robust statistical estimators have two major practical purposes: stable estimation in the presence of outliers, and outlier detection. Traditionally, outliers are considered to be entire cases in a sample, regardless of dimensionality. State-of-the-art outlier detection methods based on robust statistics therefore flag entire cases as outliers. However, as the dimension of the data increases, it becomes increasingly likely that outliers deviate only with respect to a subset of the variables that constitute them. It has recently been shown that finding the direction of maximal outlyingness can be rewritten as a regression problem. By applying a variable selection technique to that associated regression problem, the variables that contribute most to a case's outlyingness can be detected. This detection scheme can be iterated until convergence, stopping when the case is no longer outlying after removal of the variables that contributed most to its outlyingness. This information can be embedded into a robust estimation procedure. It is well known that M estimators can be implemented efficiently in an iterative re-weighting scheme. Given the information on individual variables' contributions to outlyingness, the iterative re-weighting scheme can be adapted to use cell-specific weights instead of case weights. The estimator thus constructed is a cellwise robust M estimator. A few examples of how this general idea can be applied to specific estimators will be shown.
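
The difference between case weights and cell-specific weights in an iterative re-weighting scheme can be illustrated with a minimal Python sketch of a location M estimator. This is not the authors' method: the cell weights below are Huber weights on columnwise standardized residuals, used only as a stand-in for the sparse-outlyingness-based weights described in the abstract. The function names (`huber_weight`, `mad_scale`, `irls_location`), the tuning constant `c=1.345`, and the fixed MAD scale are all assumptions made for illustration.

```python
import numpy as np

def huber_weight(r, c=1.345):
    """Huber weight: 1 for |r| <= c, c/|r| otherwise."""
    a = np.abs(r)
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))

def mad_scale(X):
    """Columnwise median absolute deviation, scaled for consistency at the normal."""
    med = np.median(X, axis=0)
    return 1.4826 * np.median(np.abs(X - med), axis=0)

def irls_location(X, cellwise=False, c=1.345, tol=1e-8, max_iter=100):
    """Iteratively re-weighted location estimate (hypothetical sketch).

    cellwise=False: one weight per case (row), shared by all of its cells.
    cellwise=True:  one weight per cell, so a single deviating cell is
                    downweighted without discarding the rest of its row.
    Returns the location estimate and the (n, p) weight matrix.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    scale = mad_scale(X)          # kept fixed across iterations (assumption)
    mu = np.median(X, axis=0)     # robust starting value
    W = np.ones((n, p))
    for _ in range(max_iter):
        Z = (X - mu) / scale      # standardized residuals per cell
        if cellwise:
            W = huber_weight(Z, c)
        else:
            # Euclidean norm of the standardized row, rescaled by sqrt(p)
            # so the same constant c is usable in any dimension.
            d = np.sqrt(np.sum(Z ** 2, axis=1)) / np.sqrt(p)
            W = np.repeat(huber_weight(d, c)[:, None], p, axis=1)
        mu_new = np.sum(W * X, axis=0) / np.sum(W, axis=0)
        converged = np.max(np.abs(mu_new - mu)) < tol
        mu = mu_new
        if converged:
            break
    return mu, W
```

Called as `irls_location(X, cellwise=True)` on data in which only one variable of a few cases is contaminated, the sketch downweights just those cells, whereas `cellwise=False` assigns the same reduced weight to every cell of an affected case.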