Title: Accelerated continuous space topic model for textual data
Authors: Manabu Asai - Soka University (Japan) [presenting]
Seiichi Inoue - Soka University (Japan)
Abstract: The discrete infinite logistic normal (DILN) distribution has been developed for natural language processing, and the advantages of using Gaussian processes (GP), as in the DILN model, have been discussed in the literature. It has been argued that the latent Gaussian noise used in the stochastic kernel function of the DILN can be interpreted as a product of the coordinates of words in a continuous space. It has also been pointed out that the DILN model can be extended by accommodating additional processes, such as the Pitman-Yor process. Owing to this generality, variants of the DILN model, known as the "continuous space topic model" (CSTM), have been considered. The aim of this work is to improve the CSTM family by using information on the semantics and style of words in the Japanese language. Our empirical results show that the new model outperforms existing models in terms of perplexity.
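For readers unfamiliar with the perplexity measure used in the comparison, the following is a minimal sketch of the standard definition, the exponential of the negative mean log-likelihood per held-out token. The function name `perplexity` and the toy input are illustrative assumptions; the abstract does not describe the authors' evaluation code.

```python
import math

def perplexity(log_probs):
    """Perplexity from per-token natural-log probabilities on held-out text.

    Standard definition: exp(-(1/N) * sum_i log p(w_i)). Lower is better.
    """
    if not log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(log_probs) / len(log_probs))

# Hypothetical toy input: a model assigning probability 1/4 to each of 8 tokens.
uniform_log_probs = [math.log(0.25)] * 8
print(perplexity(uniform_log_probs))
```

A model that spreads probability uniformly over k candidate words has perplexity k, so the measure can be read as the effective number of words the model is choosing between at each position.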