Title: Regularized clusterwise multiblock regression
Authors: Stephanie Bougeard - ANSES (France) [presenting]
Ndeye Niang - CNAM (France)
Gilbert Saporta - CNAM (France)
Abstract: Regression coefficients are usually estimated under the assumption that the observations come from a single homogeneous population. In many applications, however, this assumption does not hold, and a single overall model fails to recover the specificities of the underlying cluster models. For the case where the variables are, in addition, structured into one block of dependent variables and several blocks of explanatory variables, we propose a new method called regularized clusterwise multiblock regression. Through a single criterion, the method simultaneously seeks: a reduction of the explanatory variables to components that can be intermediate between those of multiblock PLS and multiblock Redundancy Analysis; a partition of the observations into several clusters; and the corresponding cluster-specific multiblock regression coefficients. The three tuning parameters of this criterion, namely the regularization parameter, which aims at stabilizing the inversion of the block variance-covariance matrices, the number of components, and the number of clusters, are all chosen so as to minimize the prediction error estimated by ten-fold cross-validation. A simulation study is carried out to assess the performance of the method, and an empirical application in the field of consumer satisfaction illustrates its usefulness.
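The clusterwise idea described in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' method: it assumes a single explanatory block, a single response, and plain ridge regression in place of the multiblock components, and it leaves out the cross-validated choice of the tuning parameters. The function names `ridge_fit` and `clusterwise_ridge`, and the parameters `lam`, `n_clusters`, `n_iter` and `seed`, are hypothetical and chosen for illustration only. The sketch alternates between fitting one regularized regression per cluster and reassigning each observation to the cluster whose model predicts it best.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Ridge coefficients (intercept first) for one cluster; assumes lam > 0."""
    Xc = np.column_stack([np.ones(len(X)), X])
    penalty = lam * np.eye(Xc.shape[1])
    penalty[0, 0] = 0.0  # the intercept is not penalized
    # The ridge term stabilizes the inversion of the cross-product matrix.
    return np.linalg.solve(Xc.T @ Xc + penalty, Xc.T @ y)


def clusterwise_ridge(X, y, n_clusters, lam, n_iter=50, seed=0):
    """Alternate between per-cluster ridge fits and reassignment of observations.

    Illustrative simplification: one block, one response, no components.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    labels = rng.integers(n_clusters, size=n)  # random initial partition
    Xc = np.column_stack([np.ones(n), X])
    coefs = np.zeros((n_clusters, Xc.shape[1]))
    for _ in range(n_iter):
        # Step 1: fit one regularized regression per non-empty cluster.
        for k in range(n_clusters):
            members = labels == k
            if members.any():
                coefs[k] = ridge_fit(X[members], y[members], lam)
        # Step 2: move each observation to the cluster with the smallest
        # squared residual under the current cluster models.
        sq_resid = (y[:, None] - Xc @ coefs.T) ** 2
        new_labels = sq_resid.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # partition is stable
        labels = new_labels
    return labels, coefs
```

In practice one would run such a procedure from several random starts and, as the abstract indicates, select the regularization parameter and the number of clusters by cross-validated prediction error; both steps are omitted here for brevity.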