CMStatistics 2022
B1042
Title: Latent variable models and machine learning for prediction of employment status in Italy
Authors: Roberta Varriale - University of Rome La Sapienza (Italy) [presenting]
Marco Alfo - University La Sapienza, Rome (Italy)
Abstract: The increasing availability of large amounts of multi-source information in national statistical institutes (NSIs) makes it necessary to investigate new methodological approaches, based on combining primary and secondary data, for the production of estimates. Primary data are collected by NSIs for statistical purposes, usually through a sample survey. Secondary data, such as administrative registers and big data, are collected neither by NSIs nor for statistical purposes, but NSIs may still use them to produce statistics. For qualitative/categorical data, several methodological approaches exist for producing estimates that exploit all available information. Latent variable models can explicitly account for deficiencies in the measurement process of both survey and administrative sources. Machine learning techniques are frequently used to classify large amounts of data. The use of hidden Markov models and machine learning methods to predict individual employment status is described in the context of labour statistics. The relevant data may be drawn from the labour force survey conducted by Istat and from several administrative sources that Istat regularly acquires from external bodies.
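
To illustrate how a hidden Markov model can treat survey and administrative records as two error-prone measurements of the same latent employment status, the following is a minimal sketch and not the authors' model. It assumes two latent states (0 = employed, 1 = not employed), two indicators that are conditionally independent given the latent state, and survey values that are missing (`None`) in months when the person is not sampled. The function name `smooth_posteriors` and all probability values are hypothetical, chosen only for illustration.

```python
def smooth_posteriors(init, trans, emis_survey, emis_admin, survey_obs, admin_obs):
    """Forward-backward smoothing: P(latent state at t | all observations)."""
    n_times, n_states = len(survey_obs), len(init)

    def normalise(values):
        total = sum(values)
        return [v / total for v in values]

    # Likelihood of the observed indicators at each time, per latent state.
    # Assumption: sources are conditionally independent given the state,
    # and a missing value (None) contributes no information.
    lik = []
    for s_obs, a_obs in zip(survey_obs, admin_obs):
        row = []
        for k in range(n_states):
            p = 1.0
            if s_obs is not None:
                p *= emis_survey[k][s_obs]
            if a_obs is not None:
                p *= emis_admin[k][a_obs]
            row.append(p)
        lik.append(row)

    # Forward pass, rescaled at every step to avoid numerical underflow.
    alpha = [normalise([init[k] * lik[0][k] for k in range(n_states)])]
    for t in range(1, n_times):
        prev = alpha[-1]
        alpha.append(normalise([
            sum(prev[j] * trans[j][k] for j in range(n_states)) * lik[t][k]
            for k in range(n_states)
        ]))

    # Backward pass, also rescaled at every step.
    beta = [[1.0] * n_states for _ in range(n_times)]
    for t in range(n_times - 2, -1, -1):
        beta[t] = normalise([
            sum(trans[k][j] * lik[t + 1][j] * beta[t + 1][j] for j in range(n_states))
            for k in range(n_states)
        ])

    # Combine both passes and renormalise each time point.
    return [
        normalise([a * b for a, b in zip(alpha[t], beta[t])])
        for t in range(n_times)
    ]


# Hypothetical parameters (illustrative only, not estimated from any data).
INIT = [0.6, 0.4]                          # P(state at first time point)
TRANS = [[0.95, 0.05], [0.10, 0.90]]       # P(next state | current state)
EMIS_SURVEY = [[0.97, 0.03], [0.05, 0.95]]  # P(survey value | state)
EMIS_ADMIN = [[0.90, 0.10], [0.02, 0.98]]   # P(admin value | state)

survey = [0, None, None, 1]  # survey observed only in the first and last month
admin = [0, 0, 1, 1]         # administrative record available every month

posteriors = smooth_posteriors(INIT, TRANS, EMIS_SURVEY, EMIS_ADMIN, survey, admin)
# Predicted status: the most probable latent state at each time point.
predicted = [max(range(len(p)), key=lambda k: p[k]) for p in posteriors]
```

In practice the initial, transition and measurement-error probabilities would be estimated from the linked data rather than fixed by hand, and a machine learning classifier could be trained on the same indicators for comparison.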