View Submission - COMPSTAT2022
A0494
Title: Efficient computation of robust multivariate maximum association
Authors: Pia Pfeiffer - TU Wien (Austria) [presenting]
Andreas Alfons - Erasmus University Rotterdam (Netherlands)
Peter Filzmoser - Vienna University of Technology (Austria)
Abstract: Methods for measuring association between multivariate datasets are becoming increasingly important as more multimodal data are acquired. Canonical Correlation Analysis (CCA) is widely applied for this task, but it is neither robust in the presence of atypical observations nor well-defined in the high-dimensional case, when more variables than samples are collected. Let $R$ denote a bivariate measure of association. A measure of maximum association between two multivariate random variables $X$ and $Y$ is defined by maximizing $R$ over linear combinations of each set of variables: $\rho = \max_{\|\alpha\| = 1, \|\beta\| = 1} R(\alpha^T X, \beta^T Y)$. Choosing the Pearson correlation as $R$ yields the first canonical correlation coefficient, while a robust choice of $R$ yields a more robust estimator. These estimators have desirable theoretical properties, but computation can be a limiting factor: methods that require the computation of covariance matrices, or that are based on pairwise comparisons or grid search, do not scale well to high-dimensional data. We present an algorithm for computing robust multivariate maximum association, based on adaptive gradient descent and an M-association measure derived from a bivariate M-scatter matrix. Simulations illustrate the robustness properties of our approach, as well as its suitability for high-dimensional data. The presented algorithm can also be applied to other robust methods in the context of high-dimensional data analysis.
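
The definition of $\rho$ above can be illustrated with a minimal sketch. It is not the algorithm presented in the talk: it uses the plain (non-robust) Pearson correlation as $R$ and a fixed-step projected gradient ascent in place of adaptive gradient descent with M-association. The function names `pearson_and_grads` and `max_association`, the step size `lr`, and the iteration count `n_iter` are all hypothetical choices made for illustration. With Pearson correlation as $R$, the result should approximate the first canonical correlation coefficient.

```python
import numpy as np

def pearson_and_grads(Xc, Yc, a, b):
    """Pearson correlation of the projections Xc @ a and Yc @ b,
    with its gradients w.r.t. a and b (Xc, Yc are column-centered)."""
    u, v = Xc @ a, Yc @ b
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    r = (u @ v) / (nu * nv)
    grad_a = Xc.T @ (v / (nu * nv) - r * u / nu**2)
    grad_b = Yc.T @ (u / (nu * nv) - r * v / nv**2)
    return r, grad_a, grad_b

def max_association(X, Y, lr=0.1, n_iter=5000, seed=0):
    """Illustrative sketch (assumption, not the authors' method):
    maximize R(a^T X, b^T Y) over unit vectors a, b, with R = Pearson."""
    rng = np.random.default_rng(seed)
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    # Random starting directions on the unit spheres.
    a = rng.standard_normal(X.shape[1])
    a /= np.linalg.norm(a)
    b = rng.standard_normal(Y.shape[1])
    b /= np.linalg.norm(b)
    for _ in range(n_iter):
        _, grad_a, grad_b = pearson_and_grads(Xc, Yc, a, b)
        # Gradient step, then renormalize to keep ||a|| = ||b|| = 1.
        a = a + lr * grad_a
        a /= np.linalg.norm(a)
        b = b + lr * grad_b
        b /= np.linalg.norm(b)
    r, _, _ = pearson_and_grads(Xc, Yc, a, b)
    return r, a, b
```

Only matrix-vector products with the data are needed in each iteration, so no covariance matrix is formed explicitly. Replacing the Pearson correlation with a robust association measure would require that measure's own gradient, which this sketch does not attempt.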