Title: Spying on the prior of the number of data clusters and the partition distribution in Bayesian cluster analysis
Authors: Jan Greve - WU Vienna University of Economics and Business (Austria) [presenting]
Bettina Gruen - WU Vienna University of Economics and Business (Austria)
Gertraud Malsiner-Walli - WU Vienna University of Economics and Business (Austria)
Sylvia Fruehwirth-Schnatter - WU Vienna University of Economics and Business (Austria)
Abstract: In Bayesian model-based clustering, mixture models with an unknown number of clusters and/or components, such as Dirichlet Process Mixtures (DPMs), Pitman-Yor Mixtures (PYMs) and Mixtures of Finite Mixtures (MFMs), are becoming increasingly common. A major empirical challenge with these models is characterising the prior that each of them induces on the partition space. An approach is introduced to compute descriptive statistics of the prior on the partitions for several influential mixture models employed in Bayesian model-based clustering (specifically, DPMs and two classes of MFMs). The proposed methodology involves computationally efficient evaluation of the prior on the number of clusters and determination of the first two prior moments of symmetric additive statistics characterising the partitions. The accompanying reference implementation is available in the R package fipp. Finally, ongoing work to generalise this approach to a broader class of models is briefly mentioned.
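As an illustration of what "the prior on the number of clusters" means in the simplest case, the following minimal Python sketch computes it for a DPM with concentration parameter alpha and N observations. It relies on the standard Chinese restaurant process fact that the i-th observation opens a new cluster with probability alpha / (alpha + i - 1), independently of the earlier ones. This is not the algorithm implemented in fipp and does not cover MFMs; the function name dpm_prior_num_clusters and its arguments are assumptions made for this example only.

```python
def dpm_prior_num_clusters(n, alpha):
    """Prior pmf of the number of clusters K+ under a DPM.

    Returns a list pmf of length n + 1 with pmf[k] = P(K+ = k | n, alpha).
    Hypothetical helper for illustration; not part of any package.
    """
    if n < 1 or alpha <= 0:
        raise ValueError("need n >= 1 and alpha > 0")

    pmf = [1.0]  # with zero observations there are zero clusters
    for i in range(1, n + 1):
        # Probability that observation i starts a new cluster (CRP).
        p_new = alpha / (alpha + i - 1)
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p_new)  # joins an existing cluster
            nxt[k + 1] += mass * p_new      # opens a new cluster
        pmf = nxt
    return pmf


def prior_mean(pmf):
    """Prior expectation of K+ from its pmf."""
    return sum(k * p for k, p in enumerate(pmf))


if __name__ == "__main__":
    pmf = dpm_prior_num_clusters(n=100, alpha=1.0)
    print("P(K+ = 1..5):", [round(p, 4) for p in pmf[1:6]])
    print("E[K+]:", round(prior_mean(pmf), 4))
```

Building the distribution one observation at a time avoids evaluating Stirling numbers of the first kind directly, which grow very quickly with N.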