COMPSTAT 2016
Title: Trimming in probabilistic clustering
Authors: Gunter Ritter - University of Passau (Germany) [presenting]
Abstract: The normal mixture model is a popular tool for decomposing grouped multivariate data sets into their clusters. Nowadays its parameters are usually estimated by the likelihood paradigm. It is well known that an ML estimator does not exist but that the likelihood function possesses a consistent local maximum. Consistency of a constrained MLE is due to Hathaway. Combining the two theorems leads to a trade-off between likelihood and scale balance, and to the SBF plot (scale balance vs. fit), which points to possible solutions. It is also well known that the likelihood estimate is not robust against outliers. Some common methods of robustification, such as applying Huber's robust M-estimators to parameter estimation, adding a further component to account for outliers, or using elliptical distributions, show a clear gain in robustness. However, they have been noted to be ineffective against gross outliers, their asymptotic breakdown point being zero. Trimming offers effective protection against outliers. It leads to a transportation problem that can be solved efficiently in this setting. The asymptotic breakdown point of the resulting method is strictly positive, an indication of robustness even against gross outliers. Some applications to synthetic and real data illustrate the method.
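As a rough illustration of the trimming idea, the Python sketch below performs a single trimmed assignment step: each observation is scored by its best weighted log-density over the components, and only the highest-scoring observations are retained and labelled. This is a simplified sketch under stated assumptions, not necessarily the method presented in the talk. The function name `trimmed_assignment`, the parameter `n_keep`, and the toy data with fixed means 0 and 10 and unit variances are all hypothetical. It also omits any constraints on cluster sizes, which is where a transportation problem would enter.

```python
import numpy as np


def trimmed_assignment(scores, n_keep):
    """Assign each retained point to its best cluster; trim the rest.

    scores[i, j] is assumed to hold log(pi_j) + log f_j(x_i), i.e. the
    weighted log-density of observation i under component j.
    Returns an integer label per observation, with -1 marking trimmed points.
    """
    scores = np.asarray(scores, dtype=float)
    best_cluster = scores.argmax(axis=1)   # most plausible component per point
    best_score = scores.max(axis=1)        # its weighted log-density
    # Keep the n_keep observations that fit some component best.
    keep = np.argsort(best_score)[::-1][:n_keep]
    labels = np.full(scores.shape[0], -1, dtype=int)
    labels[keep] = best_cluster[keep]
    return labels


# Toy 1-D data: two clusters near 0 and 10, plus one gross outlier at 100.
x = np.array([0.1, -0.2, 9.8, 10.3, 100.0])
means = np.array([0.0, 10.0])              # assumed known, for illustration only
# Unit-variance normal log-densities with equal weights, up to a constant.
scores = -0.5 * (x[:, None] - means[None, :]) ** 2

labels = trimmed_assignment(scores, n_keep=4)
print(labels)
```

With `n_keep=4`, the observation at 100 is the one discarded, so it cannot influence a subsequent parameter update. In a full procedure this step would alternate with re-estimation of the component parameters from the retained points.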