CMStatistics 2021
Title: Correlations at the margins cannot be rescued by estimation
Authors: Gregory Gloor - University of Western Ontario (Canada) [presenting]
Abstract: Count compositional data result from high-throughput sequencing datasets generated by platforms, such as the Illumina instruments, that have an upper bound on the number of reads delivered. Correlation in compositional data is properly determined using ratio information, but modelling was recently used to show that compositional association near the low-count margin is unreliable. In this modelling, the discrete nature of the data caused the ratio information to be undetermined, leading to the loss of a known association. We attempt to rescue these associations using several methods, including naive priors, imputation, amalgamation, and probabilistic modelling. Results show that low-count associations cannot be recovered by any of these methods. These results further show that low-count features in compositional data are unreliable and that findings that depend wholly or partly on low-count features are suspect.
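
As an illustration only, the following minimal Python sketch shows how discretisation can leave ratio information undetermined near the low-count margin. It is not the modelling referred to in the abstract: the abundances, the `discretise` helper, and the scale values are hypothetical, and simple rounding stands in for the sequencing process.

```python
import math

# Hypothetical abundances for two features with a constant 2:1 ratio.
FEATURE_A = [200.0, 400.0, 600.0, 800.0]
FEATURE_B = [100.0, 200.0, 300.0, 400.0]


def discretise(abundances, scale):
    """Assumed stand-in for sequencing: rescale, then round to integer counts."""
    return [round(x * scale) for x in abundances]


def log_ratio(a, b):
    """Return log(a / b), or None when a zero count leaves the ratio undetermined."""
    if a == 0 or b == 0:
        return None
    return math.log(a / b)


def observed_log_ratios(scale):
    """Log-ratios of the two features after discretisation at the given scale."""
    counts_a = discretise(FEATURE_A, scale)
    counts_b = discretise(FEATURE_B, scale)
    return [log_ratio(a, b) for a, b in zip(counts_a, counts_b)]


if __name__ == "__main__":
    # High counts: every sample recovers the same log-ratio, log(2).
    print(observed_log_ratios(1.0))
    # Low counts: zeros make some ratios undetermined and the rest inconsistent.
    print(observed_log_ratios(0.002))
```

At the larger scale every sample returns the same log-ratio, so the association between the two features is intact. At the smaller scale some samples contain a zero count and the remaining samples disagree with each other, so the constant ratio is no longer visible in the counts.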