CMStatistics 2020
Title: Posterior summaries of topic models: An example from grocery retail baskets
Authors: Ioanna Manolopoulou - University College London (United Kingdom) [presenting]
Mariflor Vega Carrasco - University College London (United Kingdom)
Mirco Musolesi - University College London (United Kingdom)
Abstract: Understanding the shopping motivations behind market baskets is an important goal in the grocery retail industry. Analyzing shopping transactions demands techniques that can cope with the volume and complicated dependencies of grocery transactional data, while keeping interpretable outcomes. Latent Dirichlet Allocation (LDA) provides a suitable framework to process grocery transactions and to discover a broad representation of customers' shopping motivations. However, summarising the posterior distribution of an LDA model is challenging, because LDA is inherently a mixture model and can exhibit substantial label-switching. Moreover, even when a posterior mean is computed, a summary of corresponding uncertainty is not straightforwardly available. We introduce a clustering methodology that post-processes posterior LDA draws to summarise the entire posterior distribution and identify semantic modes represented as recurrent topics. We illustrate our methods on an example from a large UK supermarket chain.
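To make the idea of post-processing posterior draws concrete, here is a minimal sketch of one way to pool topics from several LDA draws and group them so that label-switched copies of the same topic land in the same cluster. It is not the authors' method: the abstract does not specify the clustering algorithm, so a simple greedy cosine-similarity rule is used here as a stand-in. The function names (`cluster_topics`, `recurrence`), the similarity threshold, and the toy draws are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two topic vectors (distributions over products)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cluster_topics(draws, threshold=0.9):
    """Greedily cluster topics pooled across posterior draws.

    draws: list of posterior draws; each draw is a list of topic vectors.
    threshold: assumed minimum cosine similarity for joining a cluster.
    Returns a list of clusters, each with a running-mean 'centroid' and
    'members' given as (draw index, topic index) pairs.
    """
    clusters = []
    for d, topics in enumerate(draws):
        for k, topic in enumerate(topics):
            best, best_sim = None, threshold
            for c in clusters:
                sim = cosine(topic, c["centroid"])
                if sim >= best_sim:
                    best, best_sim = c, sim
            if best is None:
                # No existing cluster is similar enough: start a new one.
                clusters.append({"centroid": list(topic), "members": [(d, k)]})
            else:
                # Update the centroid as the running mean of its members.
                n = len(best["members"])
                best["centroid"] = [
                    (m * n + x) / (n + 1) for m, x in zip(best["centroid"], topic)
                ]
                best["members"].append((d, k))
    return clusters


def recurrence(cluster, n_draws):
    """Fraction of posterior draws in which the clustered topic appears."""
    return len({d for d, _ in cluster["members"]}) / n_draws


# Toy example: three draws of a 2-topic model over 3 products.
# Topic labels are switched between draw 0 and draw 1.
draws = [
    [[0.70, 0.20, 0.10], [0.10, 0.20, 0.70]],
    [[0.10, 0.25, 0.65], [0.65, 0.25, 0.10]],
    [[0.72, 0.18, 0.10], [0.30, 0.40, 0.30]],
]
clusters = cluster_topics(draws)
for c in clusters:
    print(c["members"], recurrence(c, len(draws)))
```

In this sketch, a cluster that collects a member from most draws plays the role of a recurrent topic, and the share of draws it appears in gives a crude indication of how certain that topic is. Comparing topic vectors directly, rather than topic labels, is what makes the grouping insensitive to label-switching.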