Title: Similarity forests for time series classification
Authors: Laura Calzada - University of Oviedo (Spain) [presenting]
Maria Oskarsdottir - KU Leuven (Belgium)
Bart Baesens - KU Leuven (Belgium)
Abstract: The telecommunication industry is a saturated market in which a well-implemented retention campaign is critical to staying competitive, because retaining a customer is cheaper than acquiring a new one. Previous research has used binary classification methods to predict customer churn. Moreover, it has been shown that a customer's social relationships influence the decision to change operator. It is therefore crucial to detect customer behavioral patterns and to define accurate models that predict potential churners. We propose a novel method that extracts the dynamic influence of each customer using social network analysis techniques and predicts churn accurately in both the short and the long term with binary classification methods. Call detail records of telco customers are used to build a temporal network from which churn behavioral patterns are extracted. The dynamic influence of each customer is determined by applying centrality metrics and diffusion propagation methods over a sliding window. The resulting time series are classified with a recently proposed binary classification method called similarity forests. In addition, a comparison with other methods, such as logistic regression and random forests, evaluates the accuracy of predicting further ahead in time and the possibility of designing a method capable of detecting potential churners in both the short and the long term.
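
The sliding-window step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout (caller, callee, day), the function name degree_series, and the use of a plain degree count (number of distinct contacts) as the centrality metric are all assumptions made for illustration. Diffusion-based influence measures and the similarity forest classifier itself are not shown.

```python
from collections import defaultdict


def degree_series(calls, window, step, horizon):
    """Build one degree time series per customer from call records.

    calls   -- list of (caller, callee, day) tuples (hypothetical layout)
    window  -- window length in days
    step    -- stride between consecutive windows in days
    horizon -- total observed span in days
    Returns {customer: [distinct contacts in each window]}.
    """
    customers = sorted({c for a, b, _ in calls for c in (a, b)})
    series = {c: [] for c in customers}
    for start in range(0, horizon - window + 1, step):
        # Undirected contact sets for the calls that fall inside this window.
        neighbours = defaultdict(set)
        for a, b, day in calls:
            if start <= day < start + window and a != b:
                neighbours[a].add(b)
                neighbours[b].add(a)
        # Customers with no calls in the window get a degree of 0.
        for c in customers:
            series[c].append(len(neighbours[c]))
    return series


# Toy data, invented for illustration only.
calls = [("a", "b", 0), ("a", "c", 1), ("b", "c", 3), ("a", "b", 4)]
series = degree_series(calls, window=3, step=1, horizon=5)
```

Each customer ends up with one value per window, and these per-customer sequences are the kind of time series that a binary classifier would then label as churner or non-churner.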