Title: Application of random forests and ANOVA techniques to the aggregate modeling of road accident time series in Spain
Authors: Jose Mira - Universidad Politecnica de Madrid (Spain) [presenting]
Almudena Sanjurjo de No - Universidad Politecnica de Madrid (Spain)
Camino Gonzalez - Universidad Politécnica de Madrid (Spain)
Blanca Arenas - Universidad Politecnica de Madrid (Spain)
Abstract: The purpose is to apply machine learning techniques such as Random Forests to the macro (aggregate) modeling of road accident data in Spain over the period 2004-2013. Although the number of people killed on the road in Spain has decreased dramatically in the last decade (4032 in 2003 vs 1160 in 2016), road accidents are still a major cause of death, particularly among young people, and considerable research is being carried out to reduce the figures further. Dynamic time series models have traditionally been applied to this kind of data: the output variable is either the number of deaths or the number of fatal accidents, and the explanatory variables are the lags of the output together with the evolution of vehicle fleets, economic variables, weather conditions and, very importantly, enforcement policies, including the number of agents on the road or changes in legislation. The same data have been used to fit a nonparametric regression model, CART-Random Forest. In addition, an experimental design methodology has been developed to provide a more sophisticated, statistically based procedure for input variable selection.
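
As an illustration of the input structure described in the abstract (lags of the output plus contemporaneous explanatory variables), the following minimal Python sketch builds a lagged design matrix. The function name `make_lagged_design`, the choice of two lags and the toy numbers are assumptions made for illustration only; they are not the authors' data or implementation.

```python
def make_lagged_design(y, exog, n_lags):
    """Return (X, target) where each row of X holds
    [y[t-1], ..., y[t-n_lags]] followed by the explanatory values exog[t],
    and target holds the matching y[t]. Hypothetical helper, not from the paper.
    """
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    if len(y) != len(exog):
        raise ValueError("y and exog must have the same length")
    X, target = [], []
    # The first n_lags observations have no complete lag history, so skip them.
    for t in range(n_lags, len(y)):
        lags = [y[t - k] for k in range(1, n_lags + 1)]
        X.append(lags + list(exog[t]))
        target.append(y[t])
    return X, target


# Toy values (assumed, NOT real accident data): an output series and two
# explanatory variables per period, e.g. a fleet index and a legislation dummy.
deaths = [10, 12, 9, 11, 8, 7]
exog = [[1.0, 0], [1.1, 0], [1.2, 0], [1.3, 1], [1.4, 1], [1.5, 1]]

X, target = make_lagged_design(deaths, exog, n_lags=2)
```

The resulting `X` and `target` could then be passed to any regression model that accepts tabular inputs, for example a random forest regressor, whereas a classical dynamic time series model would impose a parametric form on the same lagged inputs.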