CMStatistics 2022
B1708
Title: Recent results in validating and benchmarking mixed-type clustering
Authors: Gero Szepannek - Stralsund University of Applied Sciences (Germany) [presenting]
Rabea Aschenbruck - Stralsund University of Applied Sciences (Germany)
Abstract: A straightforward extension of the well-known $k$-means clustering algorithm to mixed-type data is given by $k$-prototypes, as implemented in the clustMixType R package. In the recent past, however, several other algorithms have been proposed. An important challenge (not only in clustering) is model selection. For clustering, this covers in particular the number of clusters, but also the selection of variables and of an appropriate algorithm. Benchmarking studies may help to provide guidance on these decisions. For clustering mixed-type data, only a few benchmarking results are available so far. Challenges lie in setting up appropriate simulation designs and in cluster validation. The aim is to give an overview and discussion of existing work as well as recent results, which may help to explore the landscape of mixed-type clustering algorithms.