CMStatistics 2022
Title: Practical implementation of Hilbert space reduced-rank Bayesian Gaussian processes for spatial and temporal data
Authors: Gabriel Riutort-Mayol - Foundation for the Promotion of Health and Biomedical Research of Valencia Region (FISABIO) (Spain) [presenting]
Paul-Christian Bürkner - Excellence Cluster for Simulation Technology - University of Stuttgart (Germany)
Michael Riis Andersen - Department of Applied Mathematics and Computer Science - Technical University of Denmark (Denmark)
Arno Solin - Department of Computer Science - Aalto University (Finland)
Aki Vehtari - Aalto University (Finland)
Abstract: Gaussian processes (GPs) are powerful non-parametric probabilistic models for stochastic functions, widely used for spatial and temporal data. However, their direct implementation has a computational cost that becomes intractable when the number of observations is large, especially when the model is estimated with fully Bayesian methods such as Markov chain Monte Carlo. A low-rank approximate Bayesian GP is implemented, based on a basis function approximation via Laplace eigenfunctions for stationary covariance functions. The approach is simple and has an attractive computational complexity owing to its linear structure. However, the number of basis functions used in the approximation, and consequently the computational cost, grows exponentially with the number of input dimensions. Practical guidelines are provided on how to select the key factors of the method, such as the number of basis functions and the boundary factor. Furthermore, diagnostics are proposed for checking that the selected factors are adequate, given the data, to fit the model accurately. On that basis, an iterative procedure is also developed to achieve accurate approximation performance at minimal computational cost. Several illustrative examples of the performance and applicability of the method are presented for simulated and real data in uni-dimensional (e.g. time-series data) and multi-dimensional (e.g. spatial and spatio-temporal data) cases, using the probabilistic programming language Stan.
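To make the roles of the number of basis functions and the boundary factor concrete, the following is a minimal one-dimensional sketch in plain Python. It is not the authors' Stan implementation. It assumes a squared exponential covariance function, inputs rescaled to [-1, 1], and the Laplace eigenfunctions with Dirichlet boundary conditions on [-L, L], where L is the boundary factor c times the half-range of the inputs. All function names, as well as the parameters `alpha` (marginal standard deviation) and `ell` (lengthscale), are illustrative choices.

```python
import math


def spectral_density_se(omega, alpha, ell):
    """Spectral density of the 1-D squared exponential covariance function."""
    return alpha**2 * math.sqrt(2.0 * math.pi) * ell * math.exp(-0.5 * (ell * omega) ** 2)


def sqrt_eigenvalue(j, L):
    """Square root of the j-th Laplace eigenvalue on [-L, L] (Dirichlet boundary)."""
    return j * math.pi / (2.0 * L)


def eigenfunction(x, j, L):
    """j-th Laplace eigenfunction on [-L, L], evaluated at x."""
    return math.sqrt(1.0 / L) * math.sin(sqrt_eigenvalue(j, L) * (x + L))


def approx_kernel(x1, x2, m, L, alpha, ell):
    """Reduced-rank approximation of k(x1, x2) using m basis functions."""
    return sum(
        spectral_density_se(sqrt_eigenvalue(j, L), alpha, ell)
        * eigenfunction(x1, j, L)
        * eigenfunction(x2, j, L)
        for j in range(1, m + 1)
    )


def exact_kernel(x1, x2, alpha, ell):
    """Exact squared exponential covariance function."""
    return alpha**2 * math.exp(-0.5 * ((x1 - x2) / ell) ** 2)


def max_abs_error(m, c, alpha=1.0, ell=0.3, n_grid=21):
    """Largest absolute kernel error on a grid over [-1, 1].

    The inputs span [-1, 1], so their half-range is 1 and L = c * 1.
    """
    L = c * 1.0
    grid = [-1.0 + 2.0 * i / (n_grid - 1) for i in range(n_grid)]
    return max(
        abs(approx_kernel(a, b, m, L, alpha, ell) - exact_kernel(a, b, alpha, ell))
        for a in grid
        for b in grid
    )


# Error of the approximation for an increasing number of basis functions,
# with the boundary factor fixed at c = 2 (an arbitrary illustrative value).
for m in (5, 10, 20, 40):
    print(m, max_abs_error(m, c=2.0))
```

In this sketch the function values at n inputs can be written as a linear model, f(x) ≈ Σ_j sqrt(S(sqrt(λ_j))) φ_j(x) β_j with independent standard normal weights β_j, which is the linear structure the abstract refers to. For D input dimensions, taking m basis functions per dimension gives m^D tensor-product basis functions, which is the exponential growth noted above.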