Title: Clustering in reduced space of high-dimensional binary data
Authors: Tadashi Imaizumi - Tama University (Japan) [presenting]
Abstract: In behavioral science, marketing science, and related fields, we often need to analyze a binary data matrix whose cells indicate whether a respondent responded to an item or not. When this binary data matrix has few factors, fewer than 30, Multiple Correspondence Analysis (MCA) or Joint Correspondence Analysis (JCA) has been applied, and both are useful for analyzing Burt tables. When the number of factors is larger, for example 100 or 200, the derived results become hard to interpret. A new method is therefore proposed for analyzing this type of binary data. In this method, an unknown number $G$ of clusters is assumed. The response on each binary variable is also assumed to be related to the distance between a cluster center and the center point on the dimension representing that variable. Each of the $G$ cluster centers is embedded as a point in a lower-dimensional space, typically a 2-dimensional space. These points are obtained by an orthonormal rotation of the cluster centers in the higher-dimensional space. The orthonormal matrix is derived by maximizing the between-cluster variance among the cluster centers in the lower-dimensional space. An application of the proposed method to a real data set will be shown.
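
The rotation step described in the abstract can be illustrated with a minimal Python sketch. It is not the author's algorithm: it assumes the $G$ cluster centers and cluster sizes are already known, weights the between-cluster variance by cluster size, and uses the standard result that the leading eigenvectors of the between-cluster scatter matrix give the orthonormal matrix maximizing that variance in the reduced space. The function name `project_centers` and the random example data are hypothetical.

```python
import numpy as np

def project_centers(centers, sizes, n_dims=2):
    """Embed cluster centers in n_dims dimensions (illustrative sketch only).

    centers: (G, p) array of cluster centers in the original space (assumed known).
    sizes:   (G,) array of cluster sizes, used as weights (an assumption).
    """
    centers = np.asarray(centers, dtype=float)
    sizes = np.asarray(sizes, dtype=float)
    # Size-weighted grand mean of the cluster centers.
    grand_mean = np.average(centers, axis=0, weights=sizes)
    deviations = centers - grand_mean
    # Between-cluster scatter matrix: sum_g n_g (c_g - m)(c_g - m)^T.
    between = (deviations * sizes[:, None]).T @ deviations
    # eigh returns eigenvalues in ascending order; keep the largest n_dims.
    eigvals, eigvecs = np.linalg.eigh(between)
    order = np.argsort(eigvals)[::-1][:n_dims]
    rotation = eigvecs[:, order]  # (p, n_dims), orthonormal columns
    return deviations @ rotation, rotation

# Hypothetical example: 4 cluster centers over 100 binary variables.
rng = np.random.default_rng(0)
centers = rng.random((4, 100))   # stand-in response proportions per cluster
sizes = np.array([50, 30, 15, 5])
embedded, rotation = project_centers(centers, sizes)
```

Here `embedded` holds the 2-dimensional coordinates of the four centers and `rotation` holds the orthonormal columns. Estimating the clusters themselves and the response model linking distances to binary responses are outside the scope of this sketch.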