CMStatistics 2017
B1702
Title: Model selection as a multiple hypothesis testing procedure: Improving Akaike's information criterion
Authors: Adrien Saumard - CREST-ENSAI (France) [presenting]
Fabien Navarro - CREST-ENSAI (France)
Abstract: By interpreting model selection as a multiple hypothesis testing task, we propose a modification of Akaike's Information Criterion (AIC) that avoids overfitting. We call this correction an over-penalization of AIC. We prove the nonasymptotic optimality of our procedure for histogram selection in density estimation by establishing sharp oracle inequalities for the Kullback-Leibler divergence. A strong feature of our theoretical results is that they cover the estimation of unbounded log-densities. To do so, we prove several analytical and probabilistic lemmas that are of independent interest. We also demonstrate the practical superiority of our over-penalization procedure over other model selection strategies in an extended, fully reproducible experimental study. Our procedure is implemented in an R package.
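To make the idea of penalized histogram selection concrete, here is a minimal Python sketch under stated assumptions. It is not the authors' procedure and not their R package. The abstract does not give the form of the over-penalization, so the constant `inflation` factor is a hypothetical stand-in, as are the function names and the restriction to regular histograms on [0, 1). Setting `inflation=0` gives the usual AIC-type criterion (negative log-likelihood plus the number of free parameters).

```python
import math
import random


def histogram_log_likelihood(sample, n_bins):
    """Maximized log-likelihood of a regular histogram on [0, 1) with n_bins bins."""
    n = len(sample)
    counts = [0] * n_bins
    for x in sample:
        # Clamp so that x == 1.0 falls in the last bin.
        counts[min(int(x * n_bins), n_bins - 1)] += 1
    # The histogram density on bin j is counts[j] * n_bins / n;
    # empty bins contribute nothing to the log-likelihood.
    return sum(c * math.log(c * n_bins / n) for c in counts if c > 0)


def select_bins(sample, candidates, inflation=0.0):
    """Return the bin count minimizing -loglik + (1 + inflation) * (n_bins - 1).

    `inflation` is a hypothetical constant used only for illustration;
    it is NOT the correction derived by the authors.
    """
    def criterion(n_bins):
        penalty = (1.0 + inflation) * (n_bins - 1)
        return -histogram_log_likelihood(sample, n_bins) + penalty

    return min(candidates, key=criterion)


# Illustrative usage on simulated data (assumed setup, not from the abstract).
rng = random.Random(0)
sample = [rng.betavariate(2, 5) for _ in range(500)]
candidates = range(1, 51)
aic_choice = select_bins(sample, candidates)
over_choice = select_bins(sample, candidates, inflation=0.5)
```

Because the penalty grows with the number of bins, a larger `inflation` can never lead to a finer histogram than plain AIC on the same sample, which is the sense in which over-penalization guards against overfitting in this toy setting.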