CMStatistics 2018
B1268
Title: Leveraging the power of correlation in a data network: A machine learning approach
Authors: Annalisa Appice - University of Bari Aldo Moro, Dipartimento di Informatica (Italy) [presenting]
Donato Malerba - University of Bari Aldo Moro (Italy)
Abstract: Predictive modelling of a data network is complicated by the presence of correlation. Recent studies have shown that taking label correlations into account can improve the accuracy of predictive inferences in data network domains. The trend cluster is a space-time pattern, defined in machine learning, that models both the node correlation and the temporal dependence of a data network. Specifically, it describes any cluster of linked nodes that collect measures of a numeric field whose temporal variation, called the trend polyline, is similar along a time horizon. Trend cluster discovery is first investigated as an effective means of summarizing a geophysical data network. It is then combined with inverse distance weighting and least-squares regression to derive a predictive model for the ubiquity interpolation of unobserved data. Finally, it is investigated as a means of enriching a data network with forecasting ability. In particular, the forecasting ability is used to identify outliers, and the correlation of outliers is analysed to classify changes and reduce the number of false anomalies.
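
For readers unfamiliar with inverse distance weighting, the following is a minimal sketch of the generic technique only. The abstract does not say how it is combined with trend clusters or least-squares regression, so none of that is shown. The function name `idw_interpolate`, the 2-D coordinates, and the default exponent `power=2.0` are assumptions made for illustration, not details taken from the work.

```python
import math


def idw_interpolate(known, target, power=2.0):
    """Estimate a value at `target` from observed nodes (generic IDW sketch).

    known  -- list of ((x, y), value) pairs for nodes with observations
    target -- (x, y) position of the unobserved location
    power  -- distance exponent (assumed default; not from the abstract)
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in known:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            # The target coincides with an observed node: use its value.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one observed node is required")
    return weighted_sum / weight_total


# Hypothetical readings from two nodes on a line.
readings = [((0.0, 0.0), 10.0), ((2.0, 0.0), 20.0)]
midpoint_estimate = idw_interpolate(readings, (1.0, 0.0))
```

Nearer nodes receive larger weights, so the estimate at a location equidistant from both nodes is their plain average, and the estimate moves toward a node's value as the location approaches that node.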