Title: Random forests for time series
Authors: Jean-Michel Poggi - University Paris-Saclay Orsay (France) [presenting]
Yannig Goude - EDF (France)
Hui Yan - EDF (France)
Benjamin Goehry - University Paris-Sud Orsay (France)
Pascal Massart - University Paris-Sud (France)
Abstract: Random forests were introduced in 2001 by Breiman and have since become a popular learning algorithm for both regression and classification. However, when applied to time series, random forests ignore the time-dependent structure: they implicitly assume that the observations are independent and treat each instant as a separate observation. We propose variants of random forests designed for time series. The idea is to replace the standard bootstrap with a dependent bootstrap (i.e. block bootstrap) to subsample the time series during the tree construction phase, so that time dependence is taken into account. We then present two numerical experiments on electricity load forecasting. The first one, at a disaggregated level, is based on an application to load forecasting for a building and illustrates how the variants may perform. The second one is at a more aggregated level (French national forecasting) but focuses on atypical periods. For both, we explore a heuristic for choosing the block size, the new parameter. Some additional experiments with generic time series data are also performed and briefly discussed. Finally, our R package rangerts is freely available on GitHub.
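
To make the block bootstrap idea concrete, here is a minimal Python sketch of one common variant, the moving (overlapping) block bootstrap, used to draw the row indices on which a single tree would be grown. It is an illustration under assumptions and not the rangerts implementation: the function name moving_block_bootstrap_indices and its parameters are hypothetical, and the abstract does not state which block bootstrap variants the package provides.

```python
import numpy as np


def moving_block_bootstrap_indices(n_obs, block_size, rng):
    """Draw n_obs row indices by concatenating randomly placed contiguous blocks.

    Hypothetical helper for illustration only; not part of rangerts.
    """
    if not 1 <= block_size <= n_obs:
        raise ValueError("block_size must be between 1 and n_obs")
    n_blocks = -(-n_obs // block_size)  # ceiling division
    # Overlapping blocks: any start position such that the block fits in the series.
    starts = rng.integers(0, n_obs - block_size + 1, size=n_blocks)
    # Expand each start into block_size consecutive indices.
    idx = (starts[:, None] + np.arange(block_size)[None, :]).ravel()
    # Trim the last block so the resample has exactly n_obs rows.
    return idx[:n_obs]


rng = np.random.default_rng(0)
idx = moving_block_bootstrap_indices(n_obs=20, block_size=5, rng=rng)
# Rows idx of the design matrix would then be used to grow one tree.
```

With block_size equal to 1, the sketch reduces to the standard i.i.d. bootstrap with replacement. Larger blocks keep runs of consecutive observations together within each resample, which is how the dependence structure is preserved.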