CMStatistics 2018
Title: Model-based clustering of high dimensional data using copulas
Authors: Marta Nai Ruscone - Università degli Studi di Genova (Italy) [presenting]
Abstract: Finite mixtures are used for model-based clustering of multivariate data. Existing models offer limited flexibility for modelling the dependence in the data, since they rely on potentially undesirable correlation restrictions and strict assumptions on the marginal distributions. We recently proposed a model-based clustering method via R-vine copulas that overcomes these restrictions by building flexible dependence models for an arbitrary number of variables from bivariate building blocks. This method performs poorly in high-dimensional spaces, however, because it leads to over-parametrized models. We propose a more parsimonious version of the R-vine copula clustering method to alleviate the computational burden and the risk of overfitting. The model is based on selecting the hyper-parameters of sparse model classes, using truncated and thresholded R-vine copulas. We use simulated and real datasets to illustrate the proposed procedure.
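To make the over-parametrization concrete, here is a minimal counting sketch. It is not the authors' procedure. It relies only on the standard structure of an R-vine on d variables (d - 1 trees, with d - t pair copulas in tree t). The function names `n_pair_copulas` and `n_after_threshold` are hypothetical, and the thresholding rule shown (a pair copula is set to independence when its absolute Kendall's tau falls below the threshold) is an assumption about how thresholded vines are usually defined, not a detail stated in the abstract.

```python
from typing import Iterable, Optional


def n_pair_copulas(d: int, trunc_level: Optional[int] = None) -> int:
    """Count the pair copulas in an R-vine on d variables.

    Tree t (t = 1, ..., d - 1) contains d - t pair copulas. Truncating at
    level k keeps only the first k trees; all later trees are treated as
    independence copulas and contribute no parameters.
    """
    if d < 2:
        raise ValueError("an R-vine needs at least two variables")
    max_level = d - 1
    k = max_level if trunc_level is None else min(trunc_level, max_level)
    if k < 0:
        raise ValueError("truncation level must be non-negative")
    return sum(d - t for t in range(1, k + 1))


def n_after_threshold(taus: Iterable[float], threshold: float) -> int:
    """Count pair copulas kept under thresholding (assumed rule).

    A pair copula is kept only if |Kendall's tau| >= threshold; the others
    are replaced by the independence copula.
    """
    return sum(1 for tau in taus if abs(tau) >= threshold)


if __name__ == "__main__":
    print(n_pair_copulas(50))      # full vine: 50 * 49 / 2 = 1225
    print(n_pair_copulas(50, 3))   # truncated at 3 trees: 49 + 48 + 47 = 144
    print(n_after_threshold([0.6, -0.02, 0.1, -0.4], 0.05))  # 3
```

In a mixture, each component carries its own vine, so these counts are multiplied by the number of components. That is why the quadratic growth of the full model becomes a problem in high dimensions, and why the truncation level and threshold act as sparsity hyper-parameters.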