Title: A prediction divergence criterion for model selection in high dimensional settings
Authors: Maria-Pia Victoria-Feser - University of Geneva (Switzerland) [presenting]
Stephane Guerrier - University of Illinois at Urbana-Champaign (United States)
Marco Avella Medina - MIT (United States)
Abstract: A new class of model selection criteria is proposed that is suited to stepwise approaches and can also be used as a selection criterion in penalized estimation methods. This new class, called the d-class of error measures, generalizes Efron's q-class. It not only contains classical criteria such as Mallows' Cp or the AIC, but also enables one to define new, more general criteria. Within this class, we propose a model selection criterion based on a divergence between the predictions of two nested models, which we call the Prediction Divergence Criterion (PDC). Classical approaches associate a criterion with each candidate model in a sequence and base the selection decision on the sign of the differences between these criteria. The PDC instead directly measures the prediction error divergence between two nested models, and thus provides a different measure of prediction error. As an example, we consider linear regression models and propose a PDC that is the direct counterpart of Mallows' Cp. We show that a selection procedure based on the PDC, compared to one based on the Cp, has a smaller probability of overfitting and an asymptotically negligible probability of selecting a larger model for models with more than one additional nonsignificant covariate. The PDC is particularly well suited to high-dimensional and sparse settings, and also to settings with small model misspecifications.
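For orientation, the classical baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of Mallows' Cp computed along a sequence of nested linear models, not an implementation of the PDC, whose exact form is not given in the abstract. The function names `mallows_cp` and `cp_path`, the simulated data, the absence of an intercept, and the choice to estimate the error variance from the full model are all assumptions made for the example.

```python
import numpy as np

def mallows_cp(X, y, sigma2):
    """Mallows' Cp = RSS / sigma2 - n + 2p for the least-squares fit of y on X."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return rss / sigma2 - n + 2 * p

def cp_path(X, y):
    """Cp for the nested models built from the first 1, 2, ..., K columns of X."""
    n, K = X.shape
    # Assumption: sigma^2 is estimated from the full model, which requires n > K.
    beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma2 = float(np.sum((y - X @ beta_full) ** 2)) / (n - K)
    return np.array([mallows_cp(X[:, :k], y, sigma2) for k in range(1, K + 1)])

# Simulated sparse example (hypothetical data): only the first two covariates matter.
rng = np.random.default_rng(0)
n, K = 200, 6
X = rng.normal(size=(n, K))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=n)

cp = cp_path(X, y)
# Classical rule: a negative difference between consecutive criteria favors
# the larger model; the selected size is the one minimizing Cp.
prefers_larger = np.diff(cp) < 0
selected = int(np.argmin(cp)) + 1
```

Here each nested model receives its own criterion value and the decision rests on the sign of consecutive differences, which is the scheme the abstract contrasts with the PDC.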