Title: Robust regression on compositional variables including cell-wise outliers
Authors: Nikola Stefelova - Palacky University Olomouc (Czech Republic) [presenting]
Andreas Alfons - Erasmus University Rotterdam (Netherlands)
Javier Palarea-Albaladejo - Biomathematics and Statistics Scotland (United Kingdom)
Peter Filzmoser - Vienna University of Technology (Austria)
Karel Hron - Palacky University (Czech Republic)
Abstract: Multivariate data are commonly arranged as a rectangular matrix with observations (cases) in the rows and variables in the columns. Ordinary robust estimators are designed to deal with case-wise outliers, assuming that most observations are free of contamination. However, this approach may lead to a significant loss of information when outliers, not necessarily many, occur at the individual cell level but affect a large fraction of observations. Further problems arise when the data are compositional. In this case, all the relevant information for statistical analysis is contained in the ratios between the parts of the composition (the columns of the data matrix), so cell-wise contamination in a single part easily propagates through these ratios and distorts the results. The aim is to present a method for robust compositional regression that is able to deal with both case-wise and cell-wise outliers. In brief, outlying cells are first filtered out and replaced by sensible values. Then, robust compositional MM-regression is carried out on the imputed dataset. Imputation uncertainty is reflected in the regression coefficient estimates via a multiple imputation (MI) scheme. The performance is assessed using simulated as well as real-world biological data.
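The propagation of cell-wise contamination through log-ratios can be illustrated with a minimal Python sketch. It is not the authors' method: it assumes centered log-ratio (clr) coordinates as one common way of expressing the ratio information, and the function name `clr` and the four-part composition are hypothetical, chosen only for illustration.

```python
import numpy as np

def clr(x):
    """Centered log-ratio coordinates: log of each part minus the mean log."""
    logx = np.log(x)
    return logx - logx.mean(axis=-1, keepdims=True)

# Hypothetical 4-part composition (illustrative values, not real data).
clean = np.array([0.10, 0.20, 0.30, 0.40])

# Contaminate a single cell: inflate the first part by a factor of 10.
dirty = clean.copy()
dirty[0] *= 10.0

# Difference in clr coordinates caused by the single contaminated cell.
shift = clr(dirty) - clr(clean)
n_changed = int(np.sum(~np.isclose(shift, 0.0)))

print("shift in clr coordinates:", shift)
print("coordinates affected:", n_changed, "of", clean.size)
```

Although only one cell was altered, every clr coordinate moves: the contaminated part shifts by three quarters of log 10 and each remaining part by minus one quarter of log 10. This is why a case-wise view, which would discard the whole observation, and a naive cell-wise view, which would treat the other parts as clean, can both be misleading for compositional data.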