Title: Robust principal components for high-dimensional data
Authors: Stefan Van Aelst - University of Leuven (Belgium) [presenting]
Holger Cevallos Valdiviezo - Ghent University (Belgium)
Matias Salibian-Barrera - The University of British Columbia (Canada)
Abstract: Classical (functional) principal component analysis can be written as a least squares optimization problem and can therefore be highly influenced by outliers. To reduce the influence of atypical measurements in the data, we propose two methods based on trimming: a multivariate least trimmed squares (LTS) estimator and a componentwise variant. The multivariate LTS estimator minimizes the least squares criterion over subsets of observations. The componentwise version minimizes the sum of the univariate LTS scale estimators computed in each of the components separately. Both methods can be applied directly to high-dimensional multivariate data. Instead of LTS scales, other robust scales such as S-scales can be used as well. The methods can also be applied to functional data. When the curves are observed at irregularly spaced points, a smoothing step can first be applied to represent the curves in a high-dimensional space; the resulting solution is then mapped back onto the functional space. Outliers can be identified by examining their orthogonal distance from the estimated subspace.
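As an illustration of the multivariate LTS idea described in the abstract, the following is a minimal sketch in Python, not the authors' algorithm. It assumes a simple concentration-step heuristic of the kind used in FAST-LTS-type procedures: fit an ordinary PCA subspace on a subset of h observations, recompute the squared orthogonal distances of all observations, and keep the h observations with the smallest distances. The function name lts_pca and the parameters alpha (fraction of observations retained), n_starts, and max_iter are hypothetical choices made for this example only.

```python
import numpy as np

def lts_pca(X, q, alpha=0.75, n_starts=20, max_iter=100, seed=0):
    """Sketch of a trimmed least squares fit of a q-dimensional PCA subspace.

    Assumption: random starts followed by concentration steps; this is an
    illustrative heuristic, not the procedure of the abstract's authors.
    Returns (objective, center, basis, orthogonal distances of all rows).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    h = int(np.ceil(alpha * n))            # number of observations kept
    best = None
    for _ in range(n_starts):
        subset = rng.choice(n, size=h, replace=False)
        prev_obj = np.inf
        for _ in range(max_iter):
            # Classical (least squares) PCA on the current subset.
            mu = X[subset].mean(axis=0)
            _, _, Vt = np.linalg.svd(X[subset] - mu, full_matrices=False)
            V = Vt[:q].T                   # p x q orthonormal basis
            # Squared orthogonal distances of ALL observations to the subspace.
            Xc = X - mu
            resid = Xc - Xc @ V @ V.T
            d2 = np.sum(resid ** 2, axis=1)
            # Concentration step: keep the h best-fitted observations.
            subset = np.argsort(d2)[:h]
            obj = d2[subset].sum()         # trimmed least squares criterion
            if prev_obj - obj <= 1e-12 * max(obj, 1.0):
                break
            prev_obj = obj
        if best is None or obj < best[0]:
            best = (obj, mu, V, np.sqrt(d2))
    return best
```

The returned orthogonal distances can be inspected to flag potential outliers, in the spirit of the last sentence of the abstract; any cutoff for flagging would be an additional choice that the abstract does not specify.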