A0689
Title: Joint sent/topic modelling: The non-exchangeability of topics trap and improved convergence diagnostics
Authors: Olivier Delmarcelle - Ghent University (Belgium) [presenting]
Kris Boudt - Vrije Universiteit Brussel and VU Amsterdam (Belgium)
David Ardia - HEC Montreal (Canada)
Abstract: The joint sentiment/topic (JST) model is an unsupervised approach to opinion mining that jointly estimates the topics and sentiment of a text, while allowing the sentiment classification to be conditional on the topic. We warn users of JST models about convergence issues caused by the non-exchangeability of sentiments and topics, which gives rise to a large number of modes that pollute MCMC inference. This pitfall becomes evident when the JST model is expressed under the novel view of the Dirichlet-Tree distribution. We propose a coherence metric designed to evaluate the quality of the inferred models. Experiments on synthetic data show that this metric is better than the likelihood at estimating model quality. Finally, we provide general guidelines on the usage of JST-class models and the tuning of their hyperparameters.