CMStatistics 2017
Title: BASiCS: Vertical and horizontal data integration for noisy single-cell expression data
Authors: Catalina Vallejos - MRC Human Genetics Unit (United Kingdom) [presenting]
Abstract: Single-cell RNA-sequencing (scRNA-seq) has transformed the field of transcriptomics, providing novel insights that were not accessible to bulk-level experiments. However, the promise of scRNA-seq comes at the cost of higher data complexity. In particular, a prominent feature of scRNA-seq experiments is strong measurement error, reflected by technical dropouts and poor correlations between technical replicates. These effects must be taken into account to reveal biological findings that are not confounded by technical variation. We will describe some statistical challenges that arise when analyzing scRNA-seq datasets. We will also introduce BASiCS (Bayesian Analysis of Single Cell Sequencing data), a Bayesian hierarchical model in which data normalization, noise quantification and downstream analyses are performed simultaneously. BASiCS exploits experimental design to disentangle biological signal from technical artifacts. This includes (i) a vertical integration approach, where a set of technical spike-in genes is used as a gold standard, and (ii) a horizontal integration framework, where technical variation is quantified by borrowing information from multiple groups of samples. Using control experiments and case studies, we will illustrate how BASiCS goes beyond traditional differential expression analyses, identifying changes in cell-to-cell gene expression variability between pre-specified groups of cells.