Title: Model-assisted estimation through random forests in finite population sampling
Authors: Mehdi Dagdoug - Université de Bourgogne Franche-Comté (France) [presenting]
Camelia Goga - Université de Bourgogne (France)
David Haziza - Université de Montréal (Canada)
Abstract: Estimating finite population parameters is of primary interest in survey sampling. At the estimation stage, auxiliary information is often available for all population units. The model-assisted approach uses this auxiliary information to construct estimators. We propose new classes of model-assisted estimators based on random forests. A random forest is an ensemble method that builds a large number of regression trees and combines them to produce more accurate predictions than a single regression tree would. Under mild conditions, the proposed model-assisted estimators are shown to be asymptotically design-unbiased and consistent. A consistent variance estimator is proposed. The asymptotic distribution of the estimators is obtained, which allows confidence intervals to be constructed. Simulations illustrate that the proposed estimators are efficient and can outperform state-of-the-art estimators, especially in complex settings.
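As a rough illustration of the model-assisted idea described in the abstract, the sketch below implements the standard difference-type estimator of a population total: predictions summed over the whole population, plus a design-weighted sum of residuals over the sample. The abstract does not spell out the exact form of the proposed estimators, so this form, the function name `model_assisted_total`, and the toy data are assumptions made for illustration only. The predictor here is a hypothetical stand-in, not a random forest.

```python
from typing import Callable, Sequence


def model_assisted_total(
    y_sample: Sequence[float],
    x_sample: Sequence[Sequence[float]],
    pi_sample: Sequence[float],
    x_population: Sequence[Sequence[float]],
    predict: Callable[[Sequence[float]], float],
) -> float:
    """Difference-type model-assisted estimator of a population total.

    t_hat = sum_{k in U} m_hat(x_k) + sum_{k in s} (y_k - m_hat(x_k)) / pi_k

    `predict` plays the role of m_hat; it is assumed to be already fitted.
    """
    # Sum of predictions over every population unit (needs x for all of U).
    prediction_total = sum(predict(x) for x in x_population)
    # Design-weighted correction using residuals on the sampled units only.
    residual_correction = sum(
        (y - predict(x)) / pi
        for y, x, pi in zip(y_sample, x_sample, pi_sample)
    )
    return prediction_total + residual_correction


# Toy data (made up for illustration): N = 4 units, n = 2 sampled.
x_population = [[1.0], [2.0], [3.0], [4.0]]
x_sample = [[1.0], [3.0]]
y_sample = [2.5, 6.0]
pi_sample = [0.5, 0.5]  # first-order inclusion probabilities

# Hypothetical stand-in predictor; a fitted random forest would go here.
estimate = model_assisted_total(
    y_sample, x_sample, pi_sample, x_population,
    predict=lambda x: 2.0 * x[0],
)

# With a predictor that is identically zero, the estimator reduces to
# the Horvitz-Thompson estimator sum_{k in s} y_k / pi_k.
ht_estimate = model_assisted_total(
    y_sample, x_sample, pi_sample, x_population,
    predict=lambda x: 0.0,
)
```

To use a random forest, `predict` could wrap a fitted regressor, for example the `predict` method of scikit-learn's `RandomForestRegressor` applied to a single row. Fitting on the sample and predicting over the population is one plausible setup, not necessarily the one the authors use.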