CMStatistics 2022
B0514
Title: Simultaneous variable selection and fusion of categorical covariates levels in penalized logistic regression
Authors: Lea Kaufmann - RWTH Aachen University (Germany) [presenting]
Maria Kateri - RWTH Aachen University (Germany)
Abstract: In penalized logistic regression for high-dimensional data with categorical covariates, dimension reduction of the parameter vector can be achieved not only through variable selection but also through the informative fusion of factor levels. For this purpose, a new regularization technique called $L_{0}$-fused group lasso, which simultaneously performs factor selection and fusion of factor levels, is introduced. Factor selection is enforced by a group lasso penalty, while level fusion is based on the $L_{0}$ ``norm'' of the differences of the corresponding coefficients, suitably adjusted for nominal and ordinal covariates. Theoretical properties, such as existence and $\sqrt{n}$-consistency of the estimators, along with oracle properties, are investigated for the fixed-dimensional case. These results are extended to the case where the number of parameters diverges with the sample size. Further, algorithms for handling the associated non-convex optimization problem and obtaining the $L_{0}$-fused group lasso estimators are developed. The performance of the proposed procedure is investigated by simulation studies.
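To make the structure of such a penalty concrete, the following is a minimal sketch of how a combined group lasso and $L_{0}$ fusion penalty could be evaluated for a given set of coefficients. It is not the authors' definition: the function name `l0_fused_group_lasso_penalty`, the tuning parameters `lam_group` and `lam_fuse`, the $\sqrt{\text{group size}}$ weight, the tolerance `tol`, and the inclusion of a reference level with coefficient 0 in the differences are all assumptions made for illustration. The sketch assumes that ordinal factors penalize differences of adjacent levels only, while nominal factors penalize all pairwise differences.

```python
from itertools import combinations

import numpy as np


def l0_fused_group_lasso_penalty(coefs, ordinal, lam_group, lam_fuse, tol=1e-8):
    """Evaluate an illustrative group lasso + L0 fusion penalty.

    coefs     : list of 1-D arrays, one per factor, holding the coefficients
                of the non-reference levels (assumption: dummy coding with
                the reference level fixed at 0).
    ordinal   : list of bools, True if the corresponding factor is ordinal.
    lam_group : weight of the group lasso (factor selection) term.
    lam_fuse  : weight of the L0 (level fusion) term.
    tol       : differences with absolute value <= tol count as zero.
    """
    total = 0.0
    for beta, is_ordinal in zip(coefs, ordinal):
        beta = np.asarray(beta, dtype=float)

        # Group lasso term: Euclidean norm of the whole coefficient block,
        # scaled by sqrt(block size) (a common convention, assumed here).
        total += lam_group * np.sqrt(beta.size) * np.linalg.norm(beta)

        # Prepend the reference level so that fusion with it is counted too.
        full = np.concatenate(([0.0], beta))

        if is_ordinal:
            # Ordinal factor: only adjacent levels may be fused.
            diffs = np.diff(full)
        else:
            # Nominal factor: any pair of levels may be fused.
            diffs = np.array(
                [full[i] - full[j] for i, j in combinations(range(full.size), 2)]
            )

        # L0 "norm": number of nonzero differences.
        total += lam_fuse * np.count_nonzero(np.abs(diffs) > tol)
    return total


# Hypothetical example: one factor with four levels (reference + three).
beta = [1.0, 1.0, 0.0]
penalty_nominal = l0_fused_group_lasso_penalty([beta], [False], 1.0, 1.0)
penalty_ordinal = l0_fused_group_lasso_penalty([beta], [True], 1.0, 1.0)
```

In this toy example the same coefficients yield a smaller fusion term when the factor is treated as ordinal, because only adjacent differences are counted. A factor whose coefficients are all zero contributes nothing to either term, which corresponds to the factor being removed from the model.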