CMStatistics 2022
Title: Supervised dimensionality reduction method for heterogeneous sources data
Authors: Kenta Sakamoto - Doshisha University (Japan) [presenting]
Hiroshi Yadohisa - Doshisha University (Japan)
Abstract: Multiple multivariate datasets, in which data on the same individuals are obtained from several sources and analyzed simultaneously, have been studied extensively in biology and information retrieval. In such studies, information potentially relevant to each individual, called label information, is often obtained separately from the multivariate data. In biology in particular, studies have prioritized extracting label information. However, previous studies have not considered the heterogeneity of the information sources, so determining which sources are relevant to the label remains difficult. Biological studies have also indicated that some information sources may not contribute to label identification, and learning from sources that have little relevance to the label is undesirable. The influence of such sources therefore needs to be reduced. We propose a method that learns stably even when the data include information sources that are not very relevant to label identification. We also quantitatively evaluate the importance of each information source for the label, which is expected to facilitate interpretation of the characteristics of each information source.