View Submission - EcoSta2022
A0171
Title: Resampling-based inferences for compositional regression when sample sizes are limited
Authors: Sujin Lee - Seoul National University (Korea, South)
Jeongyoun Ahn - University of Georgia (United States)
Sungkyu Jung - Seoul National University (Korea, South) [presenting]
Abstract: Gut microbiomes are increasingly found to be associated with many health-related characteristics of humans as well as animals. Regression with compositional microbiome covariates is commonly used to identify important bacterial taxa that are related to various phenotype responses. The number of microbiome taxa often exceeds the number of available samples, which creates a serious challenge for estimation and inference in the model. We propose a new estimation and inference procedure for linear regression models with compositional predictors when the sample size is extremely small. Under the compositional log-contrast regression framework, the proposed approach consists of two steps. The first step screens relevant predictors by fitting a log-contrast model with a sparse penalty. In the second step, the screened-in variables are used as predictors in a non-sparse log-contrast model, where each regression coefficient is tested using nonparametric, resampling-based methods such as permutation and bootstrap. The performance of the proposed methods is evaluated in a simulation study, which shows that they outperform traditional approaches based on normality assumptions or large-sample asymptotics. An application to steer microbiome data successfully identifies key bacterial taxa that are related to important cattle quality measures.
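The two-step idea described in the abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`screen`, `fit_log_contrast`, `permutation_pvalues`), the tuning parameter `frac`, and the simulated data are all hypothetical. The screening step uses a plain lasso on centered log-ratio features as a stand-in for a sparse log-contrast fit, and the test shuffles the response, which is one simple permutation scheme among several possible ones.

```python
import numpy as np

def center_rows(Z):
    # Subtract each row's mean; on log-compositions this is the centered log-ratio.
    return Z - Z.mean(axis=1, keepdims=True)

def screen(X, y, frac=0.3, n_iter=200):
    """Step 1 (stand-in): lasso by coordinate descent on centered log-ratio features.

    `frac` (hypothetical) sets the penalty as a fraction of the smallest
    penalty that would shrink every coefficient to zero.
    """
    A = center_rows(np.log(X))
    A = A - A.mean(axis=0)            # column-center so no intercept is needed
    r = y - y.mean()                  # residual, starting from beta = 0
    n, p = A.shape
    lam = frac * np.max(np.abs(A.T @ r)) / n
    col_sq = (A ** 2).sum(axis=0)
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue
            r = r + A[:, j] * beta[j]             # remove column j's contribution
            rho = A[:, j] @ r
            beta[j] = np.sign(rho) * max(abs(rho) - n * lam, 0.0) / col_sq[j]
            r = r - A[:, j] * beta[j]             # add the updated contribution back
    return np.flatnonzero(beta)

def fit_log_contrast(X_sub, y):
    """Step 2: non-sparse log-contrast fit on the screened taxa.

    Row-centering the logs and taking the minimum-norm least-squares
    solution yields coefficients that sum to zero.
    """
    Z = center_rows(np.log(X_sub))
    Z = Z - Z.mean(axis=0)
    beta, *_ = np.linalg.lstsq(Z, y - y.mean(), rcond=1e-10)
    return beta

def permutation_pvalues(X_sub, y, n_perm=999, seed=0):
    """Permutation p-value per coefficient, obtained by shuffling the response."""
    rng = np.random.default_rng(seed)
    obs = np.abs(fit_log_contrast(X_sub, y))
    exceed = np.zeros_like(obs)
    for _ in range(n_perm):
        perm_fit = np.abs(fit_log_contrast(X_sub, rng.permutation(y)))
        exceed += perm_fit >= obs
    return (1 + exceed) / (n_perm + 1)

# Simulated example (assumed data): 20 samples, 40 taxa, two taxa carry signal.
rng = np.random.default_rng(1)
n, p = 20, 40
W = rng.lognormal(size=(n, p))
X = W / W.sum(axis=1, keepdims=True)              # rows are compositions
y = 2 * np.log(X[:, 0]) - 2 * np.log(X[:, 1]) + 0.1 * rng.normal(size=n)

keep = screen(X, y)
beta = fit_log_contrast(X[:, keep], y)
pvals = permutation_pvalues(X[:, keep], y)
```

Here `keep` holds the indices of the screened-in taxa, `beta` the corresponding log-contrast coefficients, and `pvals` one permutation p-value per coefficient. A bootstrap variant would resample (row, response) pairs instead of shuffling `y`.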