CMStatistics 2017
Title: Convex variable selection for high-dimensional linear mixed models
Authors: Jozef Jakubik - Institute of Measurement Science, Slovak Academy of Sciences (Slovakia) [presenting]
Abstract: Analysis of high-dimensional data is an active field of research, driven by many applications, e.g. in genetics (DNA data in genome-wide association studies), spectrometry, or web analysis. Problems that arise in genetics can often be modelled using high-dimensional linear mixed models, because linear mixed models allow us to specify the covariance structure of the model. This enables us to capture relationships in the data such as population structure, family relatedness, etc. The high-dimensional setting presents specific theoretical as well as computational challenges. For high-dimensional linear mixed models there exist a few approaches based on \(\ell_1\) penalization or SCAD. These methods lead in general to non-convex problems. We present a convex approach to variable selection in high-dimensional linear mixed models for data with dimension over \(10^5\), where current non-convex approaches often fail. Our method provably ensures consistent variable selection with a growing number of observations. The method achieves good results in experiments on synthetic as well as real data.
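For orientation, a linear mixed model is commonly written as follows. The notation is the standard textbook form and is an assumption here, not taken from the abstract: \(y\) is the response vector of length \(n\), \(X\) the \(n \times p\) fixed-effects design matrix, \(Z\) the random-effects design matrix, and \(u\) and \(\varepsilon\) are assumed independent.

\[
y = X\beta + Zu + \varepsilon, \qquad u \sim \mathcal{N}(0, G), \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I_n),
\]

so that marginally

\[
y \sim \mathcal{N}(X\beta, V), \qquad V = Z G Z^\top + \sigma^2 I_n .
\]

In this form, structure such as population structure or family relatedness enters through the covariance \(V\), while variable selection concerns which entries of \(\beta\) are nonzero when \(p\) is much larger than \(n\).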