CMStatistics 2022
Title: A marginalization approach to local regression and clustering with variable-dimension covariates
Authors: Fernando Quintana - Pontificia Universidad Catolica de Chile (Chile) [presenting]
Garritt Page - Brigham Young University (United States)
Matthew Heiner - Brigham Young University (United States)
Abstract: Incomplete covariate vectors are known to be problematic for estimation and inferences on model parameters, but their impact on prediction performance is less understood. We develop an imputation-free method that builds on a random partition model admitting variable-dimension covariates. Cluster-specific response models further incorporate covariates via linear predictors, facilitating the estimation of smooth prediction surfaces with relatively few clusters. The response models are analytically marginalized according to the pattern of missing covariates, yielding a local regression with internally consistent uncertainty propagation that utilizes only one set of coefficients per cluster. Aggressive shrinkage of these coefficients crucially regulates uncertainty due to missing covariates. The method allows in- and out-of-sample prediction for any missingness pattern, even if the pattern in a new subject's incomplete covariate vector was not seen in the training data. We demonstrate the model's effectiveness for nonlinear prediction under various circumstances, including non-random missingness mechanisms, by comparing it with other recent methods for variable-dimension regression on synthetic and real data.
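Illustration: The marginalization step can be sketched for a single cluster under simplifying assumptions that are not taken from the abstract: a Gaussian linear response model, and covariates that are independent and Gaussian within the cluster. The function name `marginal_predictive` and all of its parameters are hypothetical, and the sketch is not the authors' implementation. It shows only how one set of coefficients can serve any missingness pattern, and why shrinking a coefficient limits the variance contributed by a missing covariate.

```python
def marginal_predictive(x, intercept, coefs, cov_means, cov_vars, noise_var):
    """Predictive mean and variance of y given only the observed entries of x.

    Illustrative assumptions (not from the source):
      y | x ~ Normal(intercept + sum_j coefs[j] * x[j], noise_var)
      x[j]  ~ Normal(cov_means[j], cov_vars[j]), independent within the cluster
    Missing covariates are marked with None and integrated out analytically.
    """
    mean = intercept
    var = noise_var
    for xj, bj, mj, vj in zip(x, coefs, cov_means, cov_vars):
        if xj is None:
            # Missing covariate: replace it by its cluster mean and add the
            # variance it induces through the coefficient.
            mean += bj * mj
            var += bj ** 2 * vj
        else:
            # Observed covariate: enters the linear predictor directly.
            mean += bj * xj
    return mean, var


# The same coefficients handle a complete and an incomplete covariate vector.
params = dict(intercept=1.0, coefs=[2.0, -1.0],
              cov_means=[0.5, 1.0], cov_vars=[1.0, 4.0], noise_var=0.25)
full = marginal_predictive([1.0, 3.0], **params)
partial = marginal_predictive([1.0, None], **params)

# Shrinking the coefficient of the missing covariate reduces the extra variance.
shrunk = marginal_predictive([1.0, None], **{**params, "coefs": [2.0, -0.5]})
```

In this toy setting, the missing second covariate inflates the predictive variance by the squared coefficient times the covariate variance, so a coefficient shrunk toward zero contributes proportionally less uncertainty.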