Title: Joint cohort and predictive modelling
Authors: Samuel Emerson - Durham University (United Kingdom) [presenting]
Louis Aslett - Durham University (United Kingdom)
Abstract: Bayesian logistic regression is a common classification model for binary response data, although in many circumstances both predictive performance and interpretability can be improved by fitting separate logistic regression models on the parts of a partition of the data. These parts represent natural clusters in the covariate space between which the model may systematically differ. For example, in health modelling settings there may be natural patient cohorts, where interest would often lie in any differences the models reveal between cohorts. We propose a method to jointly find these cohorts and fit the classification model by constructing a graph in covariate space, which is explored by a scheme proposing cuts that form cohorts. A sequential Monte Carlo sampler for the model marginal enables efficient growth and shrinkage of cohorts, making alternatives to logistic regression an easy extension. We discuss associated computational challenges that may arise in large data settings and the work in progress to ameliorate these through principled approximations. There are links to related methods such as mixture-of-experts models and model-based clustering.
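
The abstract does not say how the graph is constructed or how cuts are proposed. Purely as an illustration of the idea that cutting edges of a covariate-space graph yields cohorts, here is a minimal sketch that assumes a k-nearest-neighbour graph and a hand-picked cut. The function names `knn_graph` and `cohorts_after_cuts`, the choice of k, and the toy data are all hypothetical, and the per-cohort model fitting and sequential Monte Carlo sampler are left out.

```python
from math import dist


def knn_graph(points, k):
    """Undirected k-nearest-neighbour graph as a set of edges (i, j) with i < j.

    Assumption: the abstract does not state which graph is used; k-NN is
    only one plausible choice.
    """
    edges = set()
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def cohorts_after_cuts(n, edges, cuts):
    """Label each point by its connected component once cut edges are removed."""
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in edges - cuts:
        parent[find(i)] = find(j)

    # Relabel components 0, 1, 2, ... in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


# Toy one-dimensional covariates forming two loose groups.
points = [(0.0,), (1.0,), (2.0,), (3.5,), (4.5,), (5.5,)]
edges = knn_graph(points, k=2)

print(cohorts_after_cuts(len(points), edges, set()))     # [0, 0, 0, 0, 0, 0]
print(cohorts_after_cuts(len(points), edges, {(2, 3)}))  # [0, 0, 0, 1, 1, 1]
```

In this toy example the single edge (2, 3) bridges the two groups, so cutting it splits the data into two cohorts. In a full method, a separate classification model would then be fitted within each cohort and the cuts themselves would be proposed by the sampler rather than chosen by hand.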