Title: Bayesian modeling of metagenomic sequencing data for discovering microbial biomarkers in colorectal cancer
Authors: Shuang Jiang - Southern Methodist University (United States) [presenting]
Qiwei Li - The University of Texas at Dallas (United States)
Xiaowei Zhan - The University of Texas Southwestern Medical Center (United States)
Guanghua Xiao - The University of Texas Southwestern Medical Center (United States)
Andrew Koh - The University of Texas Southwestern Medical Center (United States)
Abstract: Colorectal cancer (CRC) is a major cause of morbidity and mortality globally. Mortality can be reduced by detecting and treating CRC at an early stage. Colonoscopy is currently the most effective CRC screening test. However, it is costly, invasive, and requires anesthesia. A simple, non-invasive test with high accuracy for CRC is urgently needed. Several recent CRC studies have demonstrated a significant association between tumorigenesis and abnormalities in the microbial community. These findings suggest that microbial taxa could be used as non-invasive CRC biomarkers. We propose a Bayesian hierarchical framework to identify a set of differentially abundant taxa, which could potentially serve as microbial biomarkers. The bottom level is a multivariate count generative model that links the observed counts in each sample to their latent normalized abundances. The top level is a Gaussian mixture model with a feature selection scheme for identifying those taxa whose normalized abundances discriminate between phenotypes. The model further employs Markov random field priors to incorporate taxonomic tree information, so that microbial biomarkers can be identified at different taxonomic ranks. A CRC case study demonstrates that a diagnostic model trained on the microbial signatures identified by our model in one CRC cohort significantly improves on current predictive performance in an independent CRC cohort.
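The two-level structure described in the abstract can be illustrated with a small forward simulation. This is only a minimal sketch under stated assumptions, not the authors' model: the abstract does not name the count distribution, so a Poisson with per-sample size factors stands in for the bottom level; the top level is reduced to a two-group Gaussian on the log scale; and the Markov random field prior over the taxonomic tree is omitted. The function name `simulate_counts` and all parameter values are hypothetical.

```python
import numpy as np

def simulate_counts(n_per_group=20, n_taxa=50, n_discriminatory=5,
                    effect=2.0, seed=0):
    """Forward-simulate counts from a simplified two-level hierarchy."""
    rng = np.random.default_rng(seed)
    n_samples = 2 * n_per_group

    # Phenotype label per sample (0 = control, 1 = case).
    z = np.repeat([0, 1], n_per_group)

    # Feature-selection indicators: True marks a discriminatory taxon.
    gamma = np.zeros(n_taxa, dtype=bool)
    gamma[rng.choice(n_taxa, n_discriminatory, replace=False)] = True

    # Top level (assumed form): log normalized abundances are Gaussian.
    # Discriminatory taxa get a group-specific mean; the rest share one.
    baseline = rng.normal(3.0, 1.0, size=n_taxa)
    shift = np.where(gamma, effect, 0.0)
    mean = baseline[None, :] + z[:, None] * shift[None, :]
    log_abundance = rng.normal(mean, 0.5)

    # Bottom level (assumed form): observed counts are Poisson, scaled by
    # a per-sample size factor that mimics uneven sequencing depth.
    size_factor = rng.uniform(0.5, 2.0, size=n_samples)
    counts = rng.poisson(size_factor[:, None] * np.exp(log_abundance))

    return counts, z, gamma, size_factor
```

Calling `simulate_counts()` returns a samples-by-taxa count matrix together with the phenotype labels, the true indicator vector, and the size factors. Posterior inference would run in the opposite direction, recovering the indicators from the counts and labels, which this sketch does not attempt.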