Title: Evaluating probabilistic classifiers: The Triptych
Authors: Timo Dimitriadis - Heidelberg University (Germany) [presenting]
Tilmann Gneiting - Heidelberg Institute for Theoretical Studies (Germany)
Alexander Jordan - Heidelberg Institute for Theoretical Studies (Germany)
Peter Vogel - (Germany)
Abstract: Predicting the occurrence probability of binary events is presumably the most common forecasting task throughout the sciences. Hence, a unified methodology for evaluating and comparing these forecasts is of great importance. We propose a new "triptych" of evaluation displays consisting of the receiver operating characteristic (ROC) curve, the CORP reliability diagram, and the Murphy diagram. Individually, these three displays focus on different and complementary aspects of a forecast's performance. The ROC curve assesses discrimination, the reliability diagram evaluates calibration, and the Murphy diagram combines both properties and visualizes overall predictive ability. In combination, the three displays give a comprehensive picture of a forecast's predictive ability. We support this intuition with the first theoretical result that connects these plots in full generality: for auto-calibrated forecasts, the ROC curve and the Murphy diagram display congruent information. We illustrate our proposal through four case studies from astrophysics, meteorology, economics, and social science.
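
As a rough illustration of what the three displays compute, the following Python sketch derives the points behind each one from probability forecasts p and binary outcomes y. It is a minimal sketch and not the authors' implementation: the function names (roc_points, pav_recalibrate, murphy_curve), the toy data, and the handling of ties are assumptions made for illustration. The reliability step uses the pool-adjacent-violators (PAV) algorithm, on which the CORP approach is based, and the Murphy step uses one common scaling of the elementary scores.

```python
import numpy as np

def roc_points(p, y):
    """Discrimination: (false alarm rate, hit rate) for every forecast threshold."""
    thresholds = np.concatenate((np.unique(p)[::-1], [-np.inf]))
    pos, neg = (y == 1), (y == 0)
    hr = np.array([(p[pos] > t).mean() for t in thresholds])
    far = np.array([(p[neg] > t).mean() for t in thresholds])
    return far, hr

def pav_recalibrate(p, y):
    """Calibration: isotonic (PAV) regression of outcomes on forecast values.

    Returns the unique forecast values and the recalibrated probabilities
    that a reliability diagram would plot against them.
    """
    x, inv = np.unique(p, return_inverse=True)
    w = np.bincount(inv).astype(float)       # observations per unique forecast
    m = np.bincount(inv, weights=y) / w      # event frequency per unique forecast
    blocks = []                              # each block: [mean, weight, n_values]
    for mean, weight in zip(m, w):
        blocks.append([mean, weight, 1])
        # Pool adjacent blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, n1 + n2])
    fitted = np.concatenate([np.full(n, mean) for mean, _, n in blocks])
    return x, fitted

def murphy_curve(p, y, thetas):
    """Overall skill: mean elementary score at each threshold theta.

    Assumed convention: cost theta for a false alarm (p > theta, y = 0),
    cost 1 - theta for a miss (p <= theta, y = 1). With this scaling,
    twice the integral over theta equals the Brier score.
    """
    p_col, y_col = p[:, None], y[:, None]
    false_alarm = (y_col == 0) & (p_col > thetas)
    miss = (y_col == 1) & (p_col <= thetas)
    return (thetas * false_alarm + (1 - thetas) * miss).mean(axis=0)

# Toy data, invented purely for illustration.
p = np.array([0.1, 0.4, 0.35, 0.8])
y = np.array([0, 0, 1, 1])

far, hr = roc_points(p, y)
auc = np.sum(np.diff(far) * (hr[1:] + hr[:-1]) / 2)   # trapezoidal area under the ROC curve
x, recalibrated = pav_recalibrate(p, y)
thetas = np.linspace(0.0, 1.0, 10001)
murphy = murphy_curve(p, y, thetas)
brier = np.mean((p - y) ** 2)
```

Plotting hr against far, recalibrated against x, and murphy against thetas gives the three panels. In this sketch the ROC points depend only on the ranking of the forecasts, the PAV fit depends on how the forecast values relate to observed event frequencies, and the Murphy curve reflects both.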