Title: Classification based on dissimilarities towards prototypes
Authors: Beibei Yuan - Leiden University (Netherlands) [presenting]
Willem Heiser - Leiden University (Netherlands)
Mark De Rooij - Leiden University (Netherlands)
Abstract: The delta-machine, a statistical learning tool for classification based on dissimilarities or distances, is introduced. In the first step, dissimilarities are computed between the profiles of the objects and a set of selected exemplars or prototypes in the predictor space. In the second step, these dissimilarities serve as predictors in a logistic regression to build classification rules. This procedure leads to nonlinear classification boundaries in the original predictor space. We discuss the delta-machine with mixed nominal, ordinal, and numerical predictor variables. Two dissimilarity measures are distinguished: the Euclidean distance and the Gower measure. The first is a general distance measure, while the second is a dissimilarity measure tailored to mixed-type variables. In simulation studies, we compared the performance of the two dissimilarity measures in the delta-machine using three types of artificial data. Overall, the Euclidean distance and the Gower measure performed similarly in terms of accuracy, but in some conditions the Euclidean distance outperformed the Gower measure. Furthermore, we will compare the classification performance of the delta-machine with that of three other classification methods on an empirical example.
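
Illustration: the two steps described in the abstract can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the toy data (two concentric rings), the hand-picked prototypes, and the use of plain gradient descent to fit the logistic regression are all hypothetical choices made for this example. The Gower function uses the commonly cited range-normalised form for numerical variables and simple matching for nominal ones; ordinal variables are left out.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two numerical profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def gower(a, b, kinds, ranges):
    """Gower dissimilarity for mixed profiles (assumed common form).

    kinds[k] is "num" or "nom"; ranges[k] is the observed range of a
    numerical variable and is ignored for nominal variables.
    """
    total = 0.0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if kind == "num":
            total += abs(x - y) / rng if rng else 0.0
        else:
            total += 0.0 if x == y else 1.0
    return total / len(kinds)

def dissimilarity_features(rows, prototypes, dist):
    """Step 1: one dissimilarity per (object, prototype) pair."""
    return [[dist(row, p) for p in prototypes] for row in rows]

def fit_logistic(X, y, lr=0.4, epochs=5000):
    """Step 2: logistic regression fitted by batch gradient descent."""
    n, m = len(X), len(X[0])
    w, b = [0.0] * m, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * m, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad_b += err
            for j in range(m):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b

def predict(X, w, b):
    """Assign class 1 when the linear predictor is positive."""
    return [1 if b + sum(wj * xj for wj, xj in zip(w, xi)) > 0 else 0
            for xi in X]

def ring(radius, n=8):
    """n equally spaced points on a circle (toy data, hypothetical)."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]

# Class 1 lies on an inner ring and class 0 on an outer ring, so no
# straight line in the original predictor space separates them.
rows = ring(0.5) + ring(2.0)
labels = [1] * 8 + [0] * 8
prototypes = [(0.0, 0.0), (2.0, 0.0)]  # hand-picked for illustration

features = dissimilarity_features(rows, prototypes, euclidean)
w, b = fit_logistic(features, labels)
accuracy = sum(p == t for p, t in zip(predict(features, w, b), labels)) / len(labels)

# Gower dissimilarity on a mixed profile: one numerical, one nominal variable.
gower_example = gower((1.0, "a"), (3.0, "b"), ("num", "nom"), (4.0, None))
```

The classification rule is linear in the dissimilarities but circular in the original coordinates, which is the sense in which the procedure yields nonlinear boundaries. To use the Gower measure in step 1, one could pass a wrapper such as `lambda a, b: gower(a, b, kinds, ranges)` as the `dist` argument.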