Title: Variable selection for longitudinal biomarkers constrained by a detection limit
Authors: Julia Geronimi - CNAM (France) [presenting]
Gilbert Saporta - CNAM (France)
Abstract: Repeated measures over time are common in the biomedical field and are widely used to analyze the link between covariates and a clinical criterion. In a longitudinal context, a large number of variables combined with missing data raises complex issues. We deal with several types of covariates: some suffer from haphazard missingness, and others are subject to detection thresholds. For the latter, Tobit regression combined with the bootstrap is an unbiased approach, but it requires complete predictors for the mean model. We propose an adaptation of the well-known multivariate imputation by chained equations (MICE). The Tobit model serves as the imputation method for covariates below the detection limit, with predictive mean matching and logistic regression used for the others. Variable selection is performed with MI-PGEE, which combines two ingredients: a) a group LASSO penalty imposed on the group of estimated regression coefficients of the same variable across the multiply-imputed datasets, which leads to a consistent selection, with the optimal shrinkage parameter chosen by minimizing a BIC-like criterion; b) generalized estimating equations (GEE), which account for the correlations due to the longitudinal context. The usefulness of the new method is illustrated by an application to the FNIH project of the Osteoarthritis Initiative.
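To illustrate why grouping coefficients across imputed datasets gives a consistent selection, the following is a minimal Python sketch of a generic group LASSO penalty and its group soft-thresholding step. It is not the authors' MI-PGEE implementation: the function names, the layout of `beta` (one row per imputed dataset, one column per variable), and the use of a plain proximal step are all assumptions made for illustration.

```python
import math

def group_norms(beta):
    """Euclidean norm of each variable's coefficients taken across imputed datasets.

    beta is a list of rows; row m holds the coefficients estimated on
    imputed dataset m (hypothetical layout chosen for this sketch).
    """
    n_vars = len(beta[0])
    return [math.sqrt(sum(row[j] ** 2 for row in beta)) for j in range(n_vars)]

def group_lasso_penalty(beta, lam):
    """Group LASSO penalty: lam times the sum of the per-variable group norms."""
    return lam * sum(group_norms(beta))

def group_soft_threshold(beta, threshold):
    """Shrink each variable's group of coefficients toward zero jointly.

    A group whose norm is at most `threshold` is set to zero in every
    imputed dataset at once, so a variable is either kept in all
    datasets or dropped from all of them.
    """
    norms = group_norms(beta)
    scales = [max(0.0, 1.0 - threshold / n) if n > 0 else 0.0 for n in norms]
    return [[row[j] * scales[j] for j in range(len(row))] for row in beta]

# Two imputed datasets (rows), two variables (columns); values are made up.
beta = [[3.0, 0.3],
        [4.0, -0.4]]
penalty = group_lasso_penalty(beta, lam=1.0)
shrunk = group_soft_threshold(beta, threshold=1.0)
```

In this toy input the second variable has a small group norm and is zeroed in both imputed datasets together, while the first is only shrunk. In practice the shrinkage parameter would be tuned, for instance with a BIC-like criterion as the abstract describes.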