Title: Generalized information criterion in high-dimensional model selection
Authors: Piotr Pokarowski - University of Warsaw (Poland)
Agnieszka Prochenka - University of Warsaw (Poland)
Michal Frej - University of Warsaw (Poland)
Wojciech Rejchel - University of Warsaw (Poland) [presenting]
Jan Mielniczuk - Institute of Computer Science Polish Academy of Sciences (Poland)
Abstract: Model selection is a fundamental challenge for data sets that contain (many) more predictors than observations. In many practical problems (e.g., in genetics or biology), finding a (small) set of significant predictors is as important as accurate estimation or prediction, or even more so. A screening-selection algorithm is presented that minimizes the empirical risk with the lasso penalty in the first step and with the generalized information criterion in the second step. We prove model selection consistency of this procedure in a wide class of models that includes generalized linear models, quantile regression, and support vector machines. The quality of the procedure is also investigated in numerical experiments.
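
The two-step idea can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: squared-error loss stands in for the general empirical risk, the lasso is solved by a plain proximal-gradient loop, the second step searches only the nested family of models obtained by ordering the screened predictors by absolute lasso coefficient, and the criterion is taken to be the residual sum of squares plus a penalty times the model size. All function names and the tuning values `lam` and `penalty` are hypothetical.

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=2000):
    """Approximately minimize (1/2n)||y - Xb||^2 + lam*||b||_1 by proximal gradient."""
    n, p = X.shape
    step = n / (np.linalg.norm(X, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    b = np.zeros(p)
    for _ in range(n_iter):
        z = b - step * (X.T @ (X @ b - y) / n)                     # gradient step
        b = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)   # soft-thresholding
    return b

def gic(X, y, subset, penalty):
    """Assumed criterion: RSS of the least-squares refit on `subset` + penalty * |subset|."""
    if len(subset) == 0:
        rss = float(np.sum(y ** 2))
    else:
        Xs = X[:, subset]
        coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        rss = float(np.sum((y - Xs @ coef) ** 2))
    return rss + penalty * len(subset)

def screening_selection(X, y, lam, penalty):
    """Step 1: lasso screening. Step 2: minimize the criterion over a nested family."""
    b = lasso_ista(X, y, lam)
    screened = np.flatnonzero(b)                        # predictors kept by the lasso
    order = screened[np.argsort(-np.abs(b[screened]))]  # largest |coefficient| first
    candidates = [order[:k].tolist() for k in range(len(order) + 1)]
    best = min(candidates, key=lambda s: gic(X, y, s, penalty))
    return sorted(best)

# Synthetic example with more predictors (200) than observations (100);
# only the first three predictors carry signal.
rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = 3.0
y = X @ beta + 0.5 * rng.standard_normal(n)

selected = screening_selection(X, y, lam=0.2, penalty=10.0)
print(selected)
```

Restricting the second step to a nested family keeps the search linear in the number of screened predictors. An exhaustive search over all subsets of the screened set would be a different (and costlier) design choice.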