CMStatistics 2018: Start Registration
View Submission - CMStatistics
Title: Online adaptive and anytime Mondrian forests Authors:  Stephane Gaiffas - Universite Paris-Diderot (France) [presenting]
Jaouad Mourtada - Ecole polytechnique (France)
Erwan Scornet - Polytechnique (France)
Abstract: Random Forests (RF) is one of the algorithms of choice in many supervised learning applications, be it classification or regression. The appeal of such methods comes from a combination of several features: a remarkable accuracy in a variety of tasks, the small number of parameters to tune, the ability to handle both numerical and categorical outputs, their reasonable computational cost at training and prediction time, and their suitability in high-dimensional settings. The most commonly used RF variants however are offline algorithms, which require the availability of the whole dataset at once. We introduce an online anytime random forest algorithm based on Mondrian forests. Using a suitable adaptation of the context tree weighting algorithm, we show that it is possible to efficiently perform an exact aggregation over all labelled prunings of the trees; in particular, this enables to obtain a truly online parameter-free algorithm, which is adaptive to the unknown regularity of the regression function. Numerical experiments show that our algorithm is competitive, compared to Breimans original random forests.