View Submission - CMStatistics
Title: A stable network inference procedure for high dimensional data
Authors:
Emilie Devijver - CNRS (France) [presenting]
Remi Molinier - Universite Grenoble Alpes - Institut Fourier (France)
Melina Gallopin - Universite Paris Sud (France)
Abstract: The stability of variable selection procedures is crucial for high-dimensional data: one hopes that the set of active variables does not depend on the observed sample, in the sense that new observations would not change it. In the context of network inference based on Gaussian graphical models, l1-penalized log-likelihood methods are not stable when the number of observations is limited. Adding structure to the estimation problem can lead to more stable results. We establish theoretical guarantees for the stability of a network inference procedure based on hierarchical clustering and a non-asymptotic model selection criterion. Unlike state-of-the-art methods that address stability, the inference procedure does not rely on data resampling. The theoretical guarantees are derived from the stability properties of single-linkage hierarchical clustering, based on topological considerations. The proposed method is particularly relevant in real data analysis when a large number of observations is difficult to obtain, as in network inference from gene expression data. Numerical experiments on simulated and real datasets support the theoretical guarantees.
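To make the role of single-linkage clustering concrete, here is a minimal sketch in Python. It is not the authors' procedure. It assumes, as one plausible reading of the abstract, that variables are first grouped by single-linkage clustering and that the network is then inferred separately within each group. The dissimilarity 1 - |correlation|, the cut height, the toy correlation matrix, and the name `single_linkage_blocks` are all illustrative assumptions. The within-group network estimator and the non-asymptotic model selection criterion are not sketched.

```python
def single_linkage_blocks(corr, height):
    """Groups of variables obtained by cutting a single-linkage dendrogram.

    Assumed dissimilarity (not taken from the abstract): 1 - |corr[j][k]|.
    Cutting a single-linkage dendrogram at `height` yields the connected
    components of the graph linking every pair whose dissimilarity is at
    most `height`, so a union-find pass over all pairs is enough.
    """
    p = len(corr)
    parent = list(range(p))

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for j in range(p):
        for k in range(j + 1, p):
            if 1.0 - abs(corr[j][k]) <= height:
                parent[find(j)] = find(k)  # merge the two components

    blocks = {}
    for j in range(p):
        blocks.setdefault(find(j), []).append(j)
    return sorted(blocks.values())


# Hypothetical correlation matrix for five variables (made up for illustration).
corr = [
    [1.0, 0.9, 0.8, 0.1, 0.0],
    [0.9, 1.0, 0.3, 0.0, 0.1],
    [0.8, 0.3, 1.0, 0.2, 0.1],
    [0.1, 0.0, 0.2, 1.0, 0.7],
    [0.0, 0.1, 0.1, 0.7, 1.0],
]

print(single_linkage_blocks(corr, height=0.5))  # prints [[0, 1, 2], [3, 4]]
```

In this toy matrix, variables 1 and 2 end up in the same group even though their own dissimilarity (0.7) exceeds the cut height, because both are close to variable 0. The cut height 0.5 also sits well away from every pairwise dissimilarity (the nearest are 0.3 and 0.7), so changing each correlation by less than 0.2 leaves the grouping unchanged. This is only an informal illustration of the kind of stability the abstract refers to, not a statement of the authors' theoretical result.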