CMStatistics 2018
Title: Quantification of the uncertainty of a partition coming from the Dirichlet process mixture model
Authors: Aurore Lavigne - University of Lille (France) [presenting]
Silvia Liverani - Queen Mary University of London (United Kingdom)
Abstract: Results on quantifying the uncertainty of a partition obtained from a Dirichlet process mixture model (DPMM) are presented. This model is popular for model-based clustering in the Bayesian framework and is used in numerous fields (machine learning, epidemiology, genetics). In the DPMM, a Dirichlet process is assigned as the prior of the mixture distribution, so the number of mixture components does not have to be specified in advance. Moreover, numerous inference methods are now well established. However, extracting a single partition from the partitions sampled from the posterior distribution is a delicate task. Numerous methods have been proposed but, in practice, they can lead to very different partitions, which makes interpretation difficult. We propose a method to quantify the uncertainty of a partition regarded as "optimal". The approach is based on an analogy with finite mixture models. We break down the predictive distribution into a mixture of densities whose weights are proportional to the cluster sizes of the "optimal" partition. We show that these densities are not parametric and that they do not depend only on the observations from the class they represent. Finally, we propose a diagram that shows, for each observation, its probability of being clustered in each class of the "optimal" partition.
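
The finite mixture analogy mentioned in the abstract can be illustrated with a small sketch. It is not the authors' method: the abstract states that their component densities are not parametric and depend on observations outside the class, whereas this sketch assumes, purely for illustration, a one-dimensional Gaussian density fitted to each class separately. The function name `membership_probabilities`, the toy data, and the standard-deviation floor are all hypothetical. Only the weighting (proportional to cluster size) and the normalisation across classes follow the description above.

```python
import numpy as np


def membership_probabilities(y, labels):
    """Finite-mixture-style membership probabilities for a fixed partition.

    Assumption (not from the abstract): each class density is a Gaussian
    fitted to the members of that class only.
    """
    y = np.asarray(y, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    n = y.size
    log_terms = np.empty((n, classes.size))
    for j, k in enumerate(classes):
        members = y[labels == k]
        weight = members.size / n          # weight proportional to cluster size
        mu = members.mean()
        sigma = max(members.std(), 1e-3)   # hypothetical floor to avoid sigma = 0
        log_density = (-0.5 * np.log(2.0 * np.pi * sigma**2)
                       - 0.5 * ((y - mu) / sigma) ** 2)
        log_terms[:, j] = np.log(weight) + log_density
    # Normalise across classes on the log scale for numerical stability.
    log_terms -= log_terms.max(axis=1, keepdims=True)
    probs = np.exp(log_terms)
    probs /= probs.sum(axis=1, keepdims=True)
    return classes, probs


# Toy data: the observation 0.1 lies between the two groups.
y = np.array([-2.1, -1.9, -2.0, -1.8, 0.1, 1.9, 2.0, 2.2, 2.1])
labels = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1])
classes, probs = membership_probabilities(y, labels)
for value, row in zip(y, probs):
    print(value, np.round(row, 3))
```

Each row of `probs` sums to one and could be drawn as a stacked bar per observation, which is one simple way to picture the kind of diagram the abstract describes.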