View Submission - SDS2022
Title: Text-based innovation indicators: a progress report
Authors: Peter Winker - University of Giessen (Germany) [presenting]
David Lenz - Justus-Liebig University Giessen (Germany)
Albina Latifi - Justus Liebig University Giessen (Germany)
Abstract: Many indicators of innovative activity exist. The projects TOBI and DynTOBI aim to develop novel text-based indicators from information on firms' websites and from news articles published by a technology-related online news provider. The presentation focuses on the latter dataset, which makes it possible to describe innovation diffusion over time. It describes the steps required to transform the raw textual data into time series that might reflect the diffusion of new products or technologies. While the results presented are mainly exploratory, the approach might be developed further for prediction purposes. In the first step of the analysis, computational methods from natural language processing are applied to identify latent topics in the text corpus and to obtain associated time series of topic weights. In addition, experts label the innovation topics. In the second step, methods from functional data analysis (FDA) are applied to group these time series into clusters. For this purpose, an implementation of the global search heuristic Threshold Accepting (TA) is used, which appears to provide better and more robust results than standard sequential techniques such as k-means. The identified clusters of prototypical innovation diffusion trends show some variability compared with the standard textbook shape. Moreover, the approach makes it possible to uncover different stages of innovation diffusion.
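To illustrate the clustering step, the following is a minimal sketch of Threshold Accepting applied to partitioning time series into k groups. It is not the implementation used in the projects: the objective (within-cluster sum of squared distances to the cluster mean curve), the neighbourhood (moving one series to another cluster), the linearly decreasing threshold sequence, and all names and parameters (`within_cluster_ss`, `threshold_accepting`, `n_rounds`, `steps_per_round`, `tau0`) are assumptions chosen for illustration, and the series are treated as plain equal-length vectors rather than smoothed functional data.

```python
import random


def within_cluster_ss(series, labels, k):
    """Sum of squared distances of each series to its cluster's mean curve."""
    total = 0.0
    for c in range(k):
        members = [s for s, lab in zip(series, labels) if lab == c]
        if not members:
            continue  # an empty cluster contributes nothing
        mean = [sum(vals) / len(members) for vals in zip(*members)]
        total += sum((x - m) ** 2 for s in members for x, m in zip(s, mean))
    return total


def threshold_accepting(series, k, n_rounds=10, steps_per_round=500,
                        tau0=1.0, seed=0):
    """Hypothetical TA sketch; requires k >= 2 and equal-length series."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in series]  # random initial partition
    current = within_cluster_ss(series, labels, k)
    best_labels, best = labels[:], current
    for r in range(n_rounds):
        # Assumed schedule: threshold shrinks linearly and is 0 in the last round.
        tau = tau0 * (1 - (r + 1) / n_rounds)
        for _ in range(steps_per_round):
            i = rng.randrange(len(series))
            old = labels[i]
            new = rng.randrange(k - 1)
            if new >= old:
                new += 1  # guarantees a cluster different from the old one
            labels[i] = new
            candidate = within_cluster_ss(series, labels, k)
            # Accept if the objective does not worsen by more than tau.
            if candidate - current <= tau:
                current = candidate
                if current < best:
                    best, best_labels = current, labels[:]
            else:
                labels[i] = old  # reject the move and restore the label
    return best_labels, best


# Toy data (invented for illustration): three rising and three falling curves.
toy = [
    [0.00, 0.10, 0.50, 0.90, 1.00],
    [0.00, 0.12, 0.52, 0.88, 1.00],
    [0.02, 0.10, 0.48, 0.90, 0.98],
    [1.00, 0.90, 0.50, 0.10, 0.00],
    [0.98, 0.90, 0.52, 0.12, 0.00],
    [1.00, 0.88, 0.50, 0.10, 0.02],
]
toy_labels, toy_objective = threshold_accepting(toy, k=2)
```

Unlike k-means, which only ever moves downhill from its starting point, this scheme tolerates small deteriorations early on, which is what allows it to leave poor local optima; the price is the need to choose a threshold sequence.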