Title: Latent group structure and regularized regression
Authors: Konstantinos Perrakis - Department of Mathematical Sciences, Durham University (United Kingdom) [presenting]
Thomas Lartigue - Aramis Project Team Inria (France)
Frank Dondelinger - Lancaster University (United Kingdom)
Sach Mukherjee - German Center for Neurodegenerative Diseases (Germany)
Abstract: Regression models generally assume that the conditional distribution of the response $Y$ given the features $X$ is the same for all samples. Standard regression models are ill-equipped for heterogeneous data in which distributions differ among latent groups, especially in large multivariate problems where hidden heterogeneity can easily go undetected. To allow for robust and interpretable regression modeling in this setting, we propose a class of regularized mixture models that couples the multivariate distribution of $X$ with the conditional distribution $Y \mid X$. This joint modeling approach offers a novel way to deal with suspected distributional shifts: it automatically controls for confounding by latent group structure and delivers scalable, sparse solutions. Estimation is handled via an expectation-maximization algorithm, whose convergence is established theoretically. We illustrate the key ideas via empirical examples.
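As an illustrative sketch only, one possible instance of such a joint mixture can be written out explicitly. The Gaussian component densities, the $\ell_1$ penalties, and the symbols $K$, $\pi_k$, $\mu_k$, $\Omega_k$, $\alpha_k$, $\beta_k$, $\sigma_k^2$, $\lambda$, $\gamma$ and $r_{ik}$ are assumptions introduced here for illustration; the abstract itself does not specify them. With $K$ latent groups, the joint density of $(X, Y)$ could take the form

$$
p(x, y) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}\!\left(x \mid \mu_k, \Omega_k^{-1}\right) \, \mathcal{N}\!\left(y \mid \alpha_k + x^\top \beta_k, \sigma_k^2\right),
$$

with parameters $\theta = \{\pi_k, \mu_k, \Omega_k, \alpha_k, \beta_k, \sigma_k^2\}_{k=1}^{K}$ estimated from samples $(x_i, y_i)$, $i = 1, \dots, n$, by maximizing a penalized log-likelihood such as

$$
\sum_{i=1}^{n} \log p(x_i, y_i) \;-\; \lambda \sum_{k=1}^{K} \lVert \beta_k \rVert_1 \;-\; \gamma \sum_{k=1}^{K} \lVert \Omega_k \rVert_1 .
$$

Under these assumptions, the E-step of an expectation-maximization scheme would compute the group responsibilities

$$
r_{ik} = \frac{\pi_k \, \mathcal{N}\!\left(x_i \mid \mu_k, \Omega_k^{-1}\right) \mathcal{N}\!\left(y_i \mid \alpha_k + x_i^\top \beta_k, \sigma_k^2\right)}{\sum_{l=1}^{K} \pi_l \, \mathcal{N}\!\left(x_i \mid \mu_l, \Omega_l^{-1}\right) \mathcal{N}\!\left(y_i \mid \alpha_l + x_i^\top \beta_l, \sigma_l^2\right)},
$$

and the M-step would update each group's parameters by solving penalized problems weighted by $r_{ik}$. Because each $r_{ik}$ depends on both $x_i$ and $y_i$, group assignment draws on the feature distribution and the regression relationship together, which is the sense in which the two parts of the model are coupled.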