CMStatistics 2017
Title: Distributed and private model-based clustering
Authors: Kaleb Leemaqz - University of Queensland (Australia) [presenting]
Abstract: Privacy is becoming increasingly important in collaborative data analysis, especially in analyses involving personal or sensitive information, which commonly arises in health and commercial settings. The aim of privacy-preserving statistical algorithms is to allow inference on the joint data without disclosing the private data held by each party. We propose a privacy-preserving EM (PPEM) algorithm, a novel scheme for training mixture models for clustering in a privacy-preserving manner. We focus on the case of data horizontally distributed among multiple parties, for which cooperative learning is required. More specifically, each party wishes to learn the global parameters of the mixture model while preventing the leakage of party-specific information, including any intermediate results that could be traced to an individual party. Another advantage of PPEM is that it does not involve a trusted third party, unlike most existing schemes, which implement a master/slave hierarchy. This helps prevent information leakage if a party is corrupted. For illustration, PPEM is applied to the widely used Gaussian mixture model (GMM), and its effectiveness is assessed through a security analysis.
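The abstract does not say how PPEM hides each party's intermediate results, so the following is only a minimal sketch of the general idea, not the authors' scheme. It assumes a one-dimensional GMM, and it assumes that parties combine their local E-step sufficient statistics with pairwise random masks that cancel in the sum, a standard secure-summation device used here as a stand-in. The names `local_stats`, `masked_shares`, `m_step`, and `fit` are hypothetical.

```python
import numpy as np

def local_stats(x, weights, means, variances):
    """E-step on one party's own data: per-component sufficient statistics."""
    # Weighted Gaussian densities, rows = data points, columns = components.
    dens = np.exp(-0.5 * (x[:, None] - means) ** 2 / variances)
    dens /= np.sqrt(2.0 * np.pi * variances)
    resp = weights * dens
    resp /= resp.sum(axis=1, keepdims=True)
    # Rows: sum of responsibilities, sum of r*x, sum of r*x^2.
    return np.stack([resp.sum(axis=0), resp.T @ x, resp.T @ x**2])

def masked_shares(stats_list, rng):
    """Hide each party's statistics with pairwise masks that cancel in the total.

    Assumption: parties i and j agree on a shared random mask; i adds it and
    j subtracts it. A real protocol would use finite-field arithmetic and
    secure channels rather than floating-point noise.
    """
    masked = [s.copy() for s in stats_list]
    for i in range(len(masked)):
        for j in range(i + 1, len(masked)):
            mask = rng.normal(scale=1e3, size=masked[i].shape)
            masked[i] += mask
            masked[j] -= mask
    return masked

def m_step(total):
    """Global M-step from the aggregated statistics only."""
    nk, sx, sxx = total
    means = sx / nk
    variances = np.maximum(sxx / nk - means**2, 1e-6)
    return nk / nk.sum(), means, variances

def fit(parties, weights, means, variances, n_iter=50, seed=0):
    """Run EM where only masked statistics are ever shared between parties."""
    rng = np.random.default_rng(seed)
    for _ in range(n_iter):
        stats = [local_stats(x, weights, means, variances) for x in parties]
        total = sum(masked_shares(stats, rng))  # masks cancel in the sum
        weights, means, variances = m_step(total)
    return weights, means, variances
```

In this sketch every party can compute the same global update from the published masked values, so no coordinating third party is needed. It does not model colluding or corrupted parties, which is what the security analysis mentioned in the abstract would address.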