Title: On analyzing zero patterns in compositional data sets
Authors: Jose Antonio Martin-Fernandez - Universitat de Girona (Spain) [presenting]
Javier Palarea-Albaladejo - Biomathematics and Statistics Scotland (United Kingdom)
Abstract: Compositional Data (CoDa) are samples of random vectors representing parts of a whole, which only carry relative information. CoDa consist of vectors with strictly positive components whose sum is usually constant (e.g., 1, 100%, 10^6). In some applications, CoDa sets include so-called essential zeros, that is, zeros that correspond to parts genuinely absent from the composition rather than to some form of censoring. For example, this is common in time use research, where some individuals spend no time on a certain activity category. Essential zeros are troublesome because replacing them with small values is generally not realistic. To investigate whether the patterns of zeros are associated with subpopulations in the data, the subgroups of samples defined by the pattern of zeros can be analyzed in terms of compositional location and variability measures obtained from the common non-zero parts. Graphical and statistical tools are introduced in this work to explore and test for differences between groups defined by zero patterns. In particular, parametric and permutation tests for log-ratio variances are presented. These tests are further generalized to the case of projections along log-contrasts of interest determined by the user. Their performance is illustrated through real and simulated data sets.
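As a rough illustration of the idea of grouping samples by zero pattern and comparing log-ratio variances on the common non-zero parts, a minimal Python sketch is given below. It is not the authors' procedure: the function names (`zero_patterns`, `logratio_variance`, `permutation_test`), the test statistic (absolute log of the variance ratio), the within-group centring before permuting, and the simulated data are all assumptions made for illustration only.

```python
import numpy as np

def zero_patterns(X):
    """Label each row by its zero pattern; '001' means only the third part is zero."""
    return np.array(["".join("1" if v == 0 else "0" for v in row) for row in X])

def logratio_variance(X, i, j):
    """Sample variance of log(x_i / x_j); parts i and j must be strictly positive."""
    return np.var(np.log(X[:, i] / X[:, j]), ddof=1)

def permutation_test(X, labels, i, j, n_perm=999, seed=0):
    """Permutation p-value for equal log-ratio variance in two groups.

    Assumed statistic: |log(var_group_a / var_group_b)| of log(x_i / x_j).
    Log-ratios are centred within each group first, so that a difference
    in location is not mistaken for a difference in spread.
    """
    rng = np.random.default_rng(seed)
    groups = np.unique(labels)
    if len(groups) != 2:
        raise ValueError("this sketch compares exactly two zero-pattern groups")
    lr = np.log(X[:, i] / X[:, j])
    for g in groups:
        lr[labels == g] -= lr[labels == g].mean()

    def stat(lab):
        v = [np.var(lr[lab == g], ddof=1) for g in groups]
        return abs(np.log(v[0] / v[1]))

    observed = stat(labels)
    count = sum(stat(rng.permutation(labels)) >= observed for _ in range(n_perm))
    return observed, (count + 1) / (n_perm + 1)

# Simulated (hypothetical) 3-part data: the first 20 samples lack part 3 entirely.
rng = np.random.default_rng(1)
raw = rng.lognormal(size=(40, 3))
raw[:20, 2] = 0.0
X = raw / raw.sum(axis=1, keepdims=True)  # closure to unit sum

labels = zero_patterns(X)
# Parts 1 and 2 (indices 0 and 1) are non-zero in both groups.
observed, p_value = permutation_test(X, labels, 0, 1)
print(sorted(set(labels)), observed, p_value)
```

Only the log-ratio between the two parts shared by both zero-pattern groups is compared here; extending the sketch to a user-chosen log-contrast would amount to replacing `lr` with a weighted sum of log parts whose weights sum to zero.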