HiTECCoDES2023
A0168
Title: Common topic identification in online Maltese news portal comments
Authors: Fiona Sammut - University of Malta (Malta) [presenting]
David Paul Suda - University of Malta (Malta)
Samuel Zammit - University of Malta (Malta)
Abstract: The aim is to identify common topics in a dataset of online news portal comments made between April 2008 and January 2017 on the Times of Malta website. Word embeddings are obtained for each unique word in the dataset using Word2Vec with the FastText algorithm. Document vectors are also obtained for each comment, so that similar comments are assigned similar representations. The resulting word and document embeddings are clustered using k-means clustering to identify common topic clusters. The results indicate that the majority of comments follow a political theme, related to party politics, foreign politics, corruption, issues of an ideological nature, or other issues. Comments on themes such as sports, arts and culture were uncommon, except around years with significant events. Additionally, several topics were found to be more prevalent during some periods than others. These include the Maltese divorce referendum in 2011, the Maltese citizenship scheme in 2013, Russia's annexation of Crimea in 2014, Brexit in 2015 and the Panama Papers in 2016.
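The clustering step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the two-dimensional word vectors are hand-made stand-ins for trained FastText embeddings, the document vector is assumed here to be a simple average of word vectors (the abstract does not say how document vectors were built), and the names WORD_VECS, doc_vector and kmeans are hypothetical.

```python
from math import dist

# Toy 2-D "embeddings", hand-made for illustration only (assumption);
# in practice these would come from a trained model such as FastText.
WORD_VECS = {
    "election": (0.9, 0.1), "party": (0.8, 0.2), "minister": (0.85, 0.15),
    "football": (0.1, 0.9), "goal": (0.2, 0.8), "match": (0.15, 0.85),
}

def doc_vector(comment):
    """Average the vectors of known words (assumed, not the authors' method)."""
    vecs = [WORD_VECS[w] for w in comment.lower().split() if w in WORD_VECS]
    if not vecs:
        raise ValueError("no known words in comment")
    return tuple(sum(c) / len(vecs) for c in zip(*vecs))

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm, seeded deterministically with the first k points."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Made-up example comments, not taken from the dataset.
comments = [
    "The minister and his party lost the election",
    "Great goal in the football match",
    "Another election another party",
    "That match had a late goal",
]
labels, centroids = kmeans([doc_vector(c) for c in comments], k=2)
print(labels)
```

With these toy inputs, the two political comments share one cluster label and the two sports comments share the other. A real analysis would use many more dimensions, a trained embedding model, and a principled choice of the number of clusters.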