CMStatistics 2017
Title: Model-based clustering of variables
Authors: Vincent Vandewalle - Inria (France) [presenting]
Thierry Mottet - Inria (France)
Abstract: In the clustering of variables framework, the goal is to group similar variables based on a distance between variables. This distance is easy to define when the variables are of the same type, but harder to define when they are of different types. In this communication we propose a model-based clustering of variables, which consists of grouping together variables that define the same groups of individuals. Within each cluster of variables, a conditional independence model is assumed for the clustering of the individuals. This model has the advantage of requiring only univariate probability distributions to be specified, and it allows a simple clustering of variables (one clustering of individuals per cluster of variables) when the partitions of individuals are known. The clustering of variables thus becomes a model selection problem, which is addressed by optimizing the BIC criterion through a modified version of the EM algorithm. The proposed approach is illustrated on simulated and real data.
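The model selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' modified EM algorithm: it only scores fixed candidate partitions of the variables and compares their BIC values. Everything specific here is an assumption made for illustration: all variables are continuous and modeled with univariate Gaussians, each cluster of variables uses two groups of individuals, BIC is taken as the log-likelihood minus half the number of parameters times log n (so larger is better), and the names `log_joint`, `fit_block` and `bic_of_partition` are hypothetical.

```python
import numpy as np

def log_joint(X, weights, means, variances):
    """log(weight_k * prod_j N(x_ij | mean_kj, var_kj)), shape (n, K)."""
    sq = (X[:, None, :] - means) ** 2 / variances
    return (np.log(weights)
            - 0.5 * np.log(2 * np.pi * variances).sum(axis=1)
            - 0.5 * sq.sum(axis=2))

def fit_block(X, n_components, n_iter=100, n_restarts=3, seed=0):
    """Plain EM for a mixture with conditionally independent Gaussian
    variables (assumed model); returns the best log-likelihood found."""
    n, d = X.shape
    best = -np.inf
    for r in range(n_restarts):
        rng = np.random.default_rng(seed + r)
        # Start from randomly chosen rows and the overall variances.
        means = X[rng.choice(n, n_components, replace=False)]
        variances = np.tile(X.var(axis=0) + 1e-6, (n_components, 1))
        weights = np.full(n_components, 1.0 / n_components)
        for _ in range(n_iter):
            # E-step: posterior probability of each group for each individual.
            lj = log_joint(X, weights, means, variances)
            resp = np.exp(lj - np.logaddexp.reduce(lj, axis=1, keepdims=True))
            # M-step: weighted proportions, means and per-variable variances.
            nk = resp.sum(axis=0) + 1e-10
            weights = nk / nk.sum()
            means = resp.T @ X / nk[:, None]
            dev2 = (X[:, None, :] - means) ** 2
            variances = (resp[:, :, None] * dev2).sum(axis=0) / nk[:, None] + 1e-6
        lj = log_joint(X, weights, means, variances)
        best = max(best, np.logaddexp.reduce(lj, axis=1).sum())
    return best

def bic_of_partition(X, blocks, n_components=2):
    """BIC of a partition of the variables: clusters of variables are
    treated as independent, so log-likelihoods and parameter counts add up."""
    n = X.shape[0]
    loglik, n_params = 0.0, 0
    for block in blocks:
        loglik += fit_block(X[:, block], n_components)
        # (K - 1) proportions plus a mean and a variance per variable and group.
        n_params += (n_components - 1) + 2 * n_components * len(block)
    return loglik - 0.5 * n_params * np.log(n)

# Simulated data: variables 0 and 1 follow one grouping of the individuals,
# variables 2 and 3 follow a second, independent grouping.
rng = np.random.default_rng(1)
n = 400
z1 = rng.integers(0, 2, n)
z2 = rng.integers(0, 2, n)
centers = np.array([-3.0, 3.0])
X = np.column_stack([centers[z1], centers[z1], centers[z2], centers[z2]])
X = X + rng.normal(size=(n, 4))

bic_true = bic_of_partition(X, [[0, 1], [2, 3]])
bic_wrong = bic_of_partition(X, [[0, 2], [1, 3]])
print(bic_true, bic_wrong)
```

On data simulated this way, the partition that matches the generating structure should obtain the larger BIC. A full procedure would search over partitions and numbers of groups rather than compare two hand-picked candidates, and mixed-type data would need other univariate distributions in place of the Gaussian.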