Title: A Bayesian non-parametric approach for multivariate peak over threshold models and anomaly detection
Authors: Bruno Sanso - University of California Santa Cruz (United States) [presenting]
Peter Trubey - University of California Santa Cruz (United States)
Abstract: We consider a constructive definition of the multivariate Pareto distribution that factorizes the random vector into a radial component and an independent angular component, using the infinity norm. We propose a method for inferring the distribution of the angular component, whose support is the limit, as p tends to infinity, of the positive orthants of the unit p-norm spheres. We introduce the projected gamma family of distributions, defined as the projection of a vector of independent gamma random variables onto the p-norm sphere. This family serves as the building block for a more flexible family obtained as a Dirichlet process mixture of projected gammas. For model assessment and comparison, we discuss scoring methods appropriate to distributions on the unit hypercube. In particular, working with the energy score criterion, we develop a kernel metric appropriate to the infinity-norm unit hypercube that produces a proper scoring rule. We leverage this score to detect observations that are anomalous relative to their predictive distribution. We consider simulated data, as well as integrated vapor transport (IVT) data along the coast of California for the years 1979 through 2020; IVT describes the rate of flow of moisture in the atmosphere. We find a clear but heterogeneous geographical dependence.
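
Illustration: the projected gamma construction and the energy score described in the abstract can be sketched in a few lines of Python with NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the shape and rate values, and the sample size are hypothetical. The energy score below uses plain Euclidean distance as a stand-in; the kernel metric for the infinity-norm hypercube developed in the work is not reproduced here, and no Dirichlet process mixture is fitted.

```python
import numpy as np


def sample_projected_gamma(shape, rate, size, p=np.inf, rng=None):
    """Draw independent gamma vectors and project them onto the unit p-norm sphere.

    shape, rate: per-coordinate gamma parameters (hypothetical values in the demo).
    p: norm used for the projection; np.inf gives the infinity-norm case.
    """
    rng = np.random.default_rng(rng)
    shape = np.asarray(shape, dtype=float)
    rate = np.asarray(rate, dtype=float)
    # NumPy parameterizes the gamma by scale, which is 1 / rate.
    g = rng.gamma(shape, 1.0 / rate, size=(size, shape.size))
    # Dividing each row by its p-norm places it on the unit p-norm sphere.
    norms = np.linalg.norm(g, ord=p, axis=1, keepdims=True)
    return g / norms


def energy_score(samples, y):
    """Sample-based energy score E||X - y|| - 0.5 E||X - X'||.

    Assumption: Euclidean distance is used here for simplicity.
    Larger values indicate that y is less compatible with the samples.
    """
    samples = np.asarray(samples, dtype=float)
    y = np.asarray(y, dtype=float)
    term_obs = np.linalg.norm(samples - y, axis=1).mean()
    pairwise = samples[:, None, :] - samples[None, :, :]
    term_spread = np.linalg.norm(pairwise, axis=2).mean()
    return term_obs - 0.5 * term_spread


# Hypothetical predictive draws on the positive part of the infinity-norm sphere.
draws = sample_projected_gamma([2.0, 3.0, 1.5], [1.0, 1.0, 1.0], size=500, rng=0)

# Score two candidate observations; a higher score flags a more anomalous point.
score_typical = energy_score(draws, draws[0])
score_corner = energy_score(draws, np.array([1.0, 0.0, 0.0]))
print(score_typical, score_corner)
```

With p set to infinity, every projected draw has its largest coordinate equal to one, so the draws lie on the faces of the unit hypercube that do not touch the origin. Ranking observations by their score against predictive draws is one simple way to flag anomaly candidates.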