CMStatistics 2018
Title: Invariant coordinate selection for outlier detection with application to quality control
Authors: Anne Ruiz-Gazen - Toulouse School of Economics (France) [presenting]
Aurore Archimbaud - Toulouse School of Economics (France)
Klaus Nordhausen - Vienna University of Technology (Austria)
Abstract: Detecting outliers in multivariate data sets is of particular interest in various contexts, including quality control in fields with high standards such as automotive or avionics. Some classical detection methods are based on the Mahalanobis distance or on robust Principal Component Analysis (PCA). One advantage of the Mahalanobis distance is its affine invariance, whereas PCA is only invariant under orthogonal transformations. PCA, for its part, allows the selection of components and facilitates the interpretation of the detected outliers. We propose an alternative, called invariant coordinate selection (ICS), for the context of casewise contamination when the number of observations is larger than the number of variables. Its principle is similar in spirit to PCA: invariant components are derived from an eigendecomposition, and the data are then projected onto some selected eigenvectors. The decomposition is based on two scatter matrix estimators instead of the single one used in PCA. While principal components are scale dependent, the invariant components are affine invariant when the scatter matrices are affine equivariant. Moreover, under some elliptical mixture models, Fisher's linear discriminant subspace coincides with a subset of invariant components, even though group memberships are unknown. The method will be illustrated on several data sets from the quality control field. The problem of multicollinearity and singular scatter matrices will also be addressed.
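The eigendecomposition on two scatter matrices described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which scatter pair is used, so the sketch assumes the sample covariance together with a fourth-moment scatter matrix, and the names `cov4` and `ics` as well as the simulated data are hypothetical.

```python
import numpy as np


def cov4(X):
    """Fourth-moment scatter matrix (assumed second scatter, not from the abstract)."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S1_inv = np.linalg.inv(np.cov(X, rowvar=False))
    # Squared Mahalanobis distance of each observation.
    d2 = np.einsum("ij,jk,ik->i", Xc, S1_inv, Xc)
    # Weighted covariance; the factor (p + 2) makes it match COV for Gaussian data.
    return (Xc * d2[:, None]).T @ Xc / (n * (p + 2))


def ics(X):
    """Invariant coordinates from the pair (COV, COV4).

    Returns the generalized eigenvalues rho (descending), the unmixing
    matrix B (rows are eigenvectors) and the invariant components Z.
    """
    S1 = np.cov(X, rowvar=False)
    S2 = cov4(X)
    # Whiten with S1^{-1/2}, then diagonalize the whitened S2.
    w, V = np.linalg.eigh(S1)
    S1_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    M = S1_inv_sqrt @ S2 @ S1_inv_sqrt
    rho, U = np.linalg.eigh((M + M.T) / 2)
    order = np.argsort(rho)[::-1]  # largest generalized kurtosis first
    rho, U = rho[order], U[:, order]
    B = U.T @ S1_inv_sqrt          # satisfies B S1 B^T = I and B S2 B^T = diag(rho)
    Z = (X - X.mean(axis=0)) @ B.T
    return rho, B, Z


# Simulated example (hypothetical data): 2% of the cases are shifted.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
X[:10, 0] += 10.0

rho, B, Z = ics(X)
# With a small proportion of outliers, they are expected to show up on the
# first component(s), i.e. those with the largest generalized kurtosis.
flagged = np.argsort(-np.abs(Z[:, 0]))[:10]
print("generalized kurtosis values:", rho)
print("cases with the largest scores on the first component:", np.sort(flagged))
```

Because both scatter matrices in this sketch are affine equivariant, applying any invertible affine transformation to `X` leaves `rho` unchanged and the components unchanged up to sign, which is the invariance property contrasted with PCA above. Inverting the covariance matrix requires it to be nonsingular, so this sketch does not cover the multicollinear case.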