View Submission - COMPSTAT2022
A0515
Title: Comparative analysis of LDA model selection criteria based on Monte Carlo simulations
Authors: Victor Bystrov - University of Lodz (Poland) [presenting]
Viktoriia Naboka - Justus Liebig University of Giessen (Germany)
Anna Staszewska-Bystrova - University of Lodz (Poland)
Peter Winker - University of Giessen (Germany)
Abstract: The performance of the recently developed singular Bayesian information criterion (sBIC) in selecting the number of topics in LDA models is evaluated and compared to the performance of alternative model selection criteria proposed for topic models. The sBIC is a generalization of the standard BIC that can be implemented in singular statistical models. The comparison is based on Monte Carlo simulations and carried out for several alternative settings, varying with respect to the number of topics, the number of documents and the size of documents in the corpora. Practical recommendations for LDA model selection are developed.
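Illustration: the abstract does not describe the simulation design in detail, so the following is only a minimal sketch of how one Monte Carlo replication could draw a synthetic corpus from the standard LDA generative process while varying the number of topics, the number of documents and the document size. The function name simulate_lda_corpus, the vocabulary size, the Dirichlet hyperparameters alpha and eta, and the grid values are all hypothetical and are not taken from the study. Fitting LDA models and computing the sBIC or the alternative criteria is not shown.

```python
import numpy as np


def simulate_lda_corpus(n_topics, n_docs, doc_len, vocab_size=50,
                        alpha=0.5, eta=0.1, seed=0):
    """Draw a document-term count matrix from the LDA generative process.

    All default values are illustrative assumptions, not the settings
    used by the authors.
    """
    rng = np.random.default_rng(seed)
    # Topic-word distributions: one Dirichlet draw per topic, shape (K, V).
    topics = rng.dirichlet(np.full(vocab_size, eta), size=n_topics)
    # Document-topic proportions: one Dirichlet draw per document, shape (D, K).
    theta = rng.dirichlet(np.full(n_topics, alpha), size=n_docs)

    counts = np.zeros((n_docs, vocab_size), dtype=np.int64)
    for d in range(n_docs):
        # Marginalising over the per-word topic assignment, each word of
        # document d is drawn from the mixture theta[d] @ topics.
        word_probs = theta[d] @ topics
        word_probs = word_probs / word_probs.sum()  # guard against rounding
        counts[d] = rng.multinomial(doc_len, word_probs)
    return counts, topics, theta


# Hypothetical grid of settings: (number of topics, documents, document size).
settings = [(k, d, n) for k in (5, 10) for d in (100, 200) for n in (50, 100)]

if __name__ == "__main__":
    for rep, (k, d, n) in enumerate(settings):
        counts, _, _ = simulate_lda_corpus(k, d, n, seed=rep)
        print(k, d, n, counts.shape)
```

In a full experiment, each simulated count matrix would be passed to LDA fits with different candidate numbers of topics, and the criterion values would be compared against the true number used in the simulation.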