Title: Tuning and tunability: Importance of hyperparameters of machine learning algorithms
Authors: Anne-Laure Boulesteix - LMU Munich (Germany)
Bernd Bischl - LMU Munich (Germany)
Philipp Probst - LMU Munich (Germany) [presenting]
Abstract: Modern machine learning algorithms for classification or regression, such as gradient boosting, random forests, and neural networks, involve a number of parameters that have to be fixed before running them. Such parameters are commonly called hyperparameters. Users of these algorithms can use the hyperparameter defaults specified in the software package they employ, set them to specific alternative values, or use a tuning strategy to optimize them with respect to performance on the specific dataset at hand. We formalize the problem of tuning from a statistical point of view and suggest general measures quantifying the tunability of hyperparameters and of algorithms. These measures are calculated for six of the most common statistical learning algorithms. Our results may help users and software developers to set defaults appropriately, to decide whether a possibly time-consuming tuning strategy is worthwhile, to focus on the most important hyperparameters, and to choose adequate hyperparameter spaces or even prior distributions for tuning strategies such as sequential model-based optimization. This is one step towards automating the model building process, which consists of several steps such as feature creation and selection, tuning, and stacking, and which is already partly available in implementations such as auto-sklearn, Auto-WEKA, and H2O AutoML. Ideally, the running time of this process can be estimated and restricted before execution.
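The abstract does not spell out the proposed measures. As a rough illustration only, the sketch below assumes one plausible reading: the tunability of an algorithm on a dataset is the performance gain of the best tuned configuration over the package default, and these gains are averaged over datasets. The function name, the dataset names, and all accuracy values are hypothetical and made up for the example; they are not results from the talk.

```python
from statistics import mean


def tunability(default_score, tuned_scores):
    """Gain of the best tuned configuration over the default.

    Assumes a score where higher is better (e.g. accuracy).
    A negative value means no tuned configuration beat the default.
    """
    return max(tuned_scores) - default_score


# Hypothetical accuracy values, invented purely for illustration.
results = {
    "dataset_a": {"default": 0.80, "tuned": [0.78, 0.82, 0.85]},
    "dataset_b": {"default": 0.90, "tuned": [0.89, 0.90, 0.91]},
}

# Per-dataset gain from tuning, then the average over datasets.
per_dataset = {
    name: tunability(r["default"], r["tuned"]) for name, r in results.items()
}
mean_tunability = mean(per_dataset.values())

for name, gain in per_dataset.items():
    print(f"{name}: gain from tuning = {gain:.3f}")
print(f"mean tunability = {mean_tunability:.3f}")
```

Under this reading, a small average gain would suggest that the defaults are already adequate, while a large gain would suggest that tuning is worth its cost. The same comparison could be restricted to a single hyperparameter by varying only that one and leaving the others at their defaults.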