CMStatistics 2022
Title: Robust and accurate estimation of cellular fractions from tissue omics data via ensemble deconvolution
Authors: Jiebiao Wang - University of Pittsburgh (United States) [presenting]
Christopher McKennan - University of Pittsburgh (United States)
Manqi Cai - University of Pittsburgh (United States)
Wei Chen - University of Pittsburgh (United States)
Abstract: Tissue-level omics data, such as transcriptomics and epigenomics, are averaged across diverse cell types. To extract cell-type-specific (CTS) signals, dozens of cellular deconvolution methods have been proposed to infer cell-type fractions from tissue-level data. However, these methods produce vastly different results under various real data settings, and simulation-based benchmarking studies have shown that no deconvolution approach is universally best. There have been attempts at ensemble methods, but they only aggregate multiple single-cell references or reference-free deconvolution methods. To achieve robust estimation of cellular fractions, we proposed EnsDeconv (Ensemble Deconvolution), which adopts CTS robust regression to synthesize the results from dozens of single deconvolution methods, reference datasets, marker gene selection procedures, data normalizations, and transformations. Unlike most benchmarking studies, which are based on simulations, we compiled four large real datasets from different tissue types, totaling 4937 tissue samples with measured cellular fractions and bulk gene expression. Comprehensive evaluations demonstrated that EnsDeconv yields more stable, robust, and accurate fractions than existing methods. We illustrated that EnsDeconv-estimated cellular fractions enable various CTS downstream analyses, such as identifying differential fractions associated with clinical variables. To increase generalizability, we further extended EnsDeconv to analyze bulk DNA methylation data.
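The idea of combining fraction estimates from many deconvolution runs can be illustrated with a minimal sketch. It is not the EnsDeconv algorithm: the abstract does not specify the CTS robust regression, so a per-cell-type median across runs is swapped in here as a simple robust aggregator. The function name `ensemble_fractions` and the toy numbers are hypothetical.

```python
import numpy as np

def ensemble_fractions(estimates):
    """Combine fraction estimates from several deconvolution runs.

    estimates: array-like of shape (n_runs, n_samples, n_cell_types).
    Returns an (n_samples, n_cell_types) array whose rows sum to 1.
    """
    stacked = np.asarray(estimates, dtype=float)
    # Assumption: the median across runs stands in for the CTS robust
    # regression named in the abstract; it is NOT the authors' method.
    consensus = np.median(stacked, axis=0)
    # Fractions cannot be negative.
    consensus = np.clip(consensus, 0.0, None)
    # Renormalize each sample so its fractions sum to 1.
    row_sums = consensus.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # guard against division by zero
    return consensus / row_sums

# Hypothetical toy input: three runs, two samples, three cell types.
estimates = [
    [[0.5, 0.3, 0.2], [0.1, 0.6, 0.3]],
    [[0.6, 0.2, 0.2], [0.2, 0.5, 0.3]],
    [[0.0, 0.0, 1.0], [0.1, 0.7, 0.2]],  # third run is an outlier on sample 0
]
result = ensemble_fractions(estimates)
print(result)
```

In this toy input the third run disagrees sharply with the other two on the first sample, and the median keeps that run from dominating the combined estimate, which is the kind of stability an ensemble aims for.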