CMStatistics 2017
Title: PhyClone: A forest structured Chinese restaurant process for inferring tumour phylogenies
Authors: Andrew Roth - University of Oxford (United Kingdom) [presenting]
Alexandre Bouchard - University of British Columbia (Canada)
Abstract: Cancer is a disease caused by the ongoing accumulation of genomic alterations. Once a cancer has formed, this process generates heterogeneity within the cancer cell population. We consider the problem of inferring the phylogenetic tree relating cancer cell populations from patient tumours using bulk genome sequencing data. Unlike data in traditional phylogenetic problems, bulk sequencing data measures an admixture of the taxa in the tree. Furthermore, taxa from all nodes in the tree, not just the leaves, may be represented. In previous work, we showed how clustering mutations that occur at the same cellular prevalence can be used to identify cancer cell populations. We extend that approach to also infer the underlying tree relating the populations. We develop a non-parametric Bayesian prior over the clustering of the data and the tree structure relating the clusters. Posterior inference is performed by Markov chain Monte Carlo (MCMC) sampling, using an auxiliary variable scheme and conditional sequential Monte Carlo (cSMC) sampling. One advantage of this model is that the parameters associated with the nodes of the tree, which represent the cellular prevalence of mutations, can be marginalized. Collapsing these variables can potentially improve the mixing of the MCMC sampler. We compare this model to previous work that used the tree-structured stick-breaking process, another non-parametric Bayesian prior over tree topologies and clusterings.
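
As a rough illustration of what a non-parametric prior over clusterings looks like, the sketch below draws a partition of mutations from the standard Chinese restaurant process. It is not the forest structured prior described in the abstract, which also places the clusters in a tree; the function name `sample_crp_partition`, the concentration parameter `alpha`, and the choice of 20 items are assumptions made only for this example.

```python
import random
from collections import Counter


def sample_crp_partition(n_items, alpha, rng):
    """Draw cluster labels for n_items from a standard CRP (illustrative only).

    alpha is the concentration parameter (assumed > 0); larger values
    tend to produce more clusters.
    """
    labels = []
    counts = []  # counts[k] = number of items currently in cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1  # join an existing cluster
        labels.append(k)
    return labels


rng = random.Random(0)
labels = sample_crp_partition(20, alpha=1.0, rng=rng)
print(labels)           # cluster label per (hypothetical) mutation
print(Counter(labels))  # cluster sizes
```

The number of clusters is not fixed in advance and can grow with the number of items, which is the property that makes such priors attractive when the number of cancer cell populations is unknown.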