Title: Machine learning for spatially aggregated data
Authors: Bo Li - University of Illinois at Urbana-Champaign (United States)
Peng Wang - University of Cincinnati (United States) [presenting]
Abstract: In recent years, statistical machine learning approaches have been extremely popular, largely due to their superior predictive performance. Of all the commonly used machine learning tools, the gradient boosting tree is usually the favored vehicle for many practitioners. On the popular data analytics competition platform Kaggle, gradient boosting is the winning algorithm for almost every structured dataset. Besides their superior predictive performance, gradient boosting trees also enjoy the interpretability of a non-parametric additive model, and their fitting algorithm can be parallelized. We extend this powerful machine learning technique to the realm of spatial data analysis. The proposed approach does not require any parametric assumption on the spatial correlations and enjoys all the advantages of gradient boosting. We illustrate the potential of the approach with an application to predicting new HIV diagnosis rates for all counties of the United States.