CMStatistics 2021
Title: Predicting Helicobacter pylori serostatus: a comparison of classification algorithms
Authors:
Emmanuelle Dankwa - University of Oxford (United Kingdom) [presenting]
Martyn Plummer - University of Warwick (United Kingdom)
Christiana Kartsonaki - University of Oxford (United Kingdom)
Abstract: The Western blot test for Helicobacter pylori (H. pylori) infection, although well established, is more labour-intensive and uses a larger amount of plasma than the alternative high-throughput multiplex serology test. Given that the two tests differ slightly in the H. pylori proteins (antigens) they consider, it was of interest to calibrate the results of multiplex serology to those of Western blot and to determine the relative importance of the various antigens in determining H. pylori serostatus. We employed five classification algorithms: logistic regression (LR), random forest (RF), elastic net, Bayesian additive regression trees (BART) and multidimensional monotone BART. These were trained on multiplex serology antigen-specific reactivity values and the corresponding Western blot results. The predictive performance of the models was compared using the Brier score, the logarithmic score and the area under the receiver operating characteristic curve (AUC). All models showed good discriminative ability on a test set (minimum and maximum AUC: 95% and 97%, respectively). By the Brier score, BART displayed the best predictive performance, although the differences in performance scores among the five models were not substantial. The BART, LR and RF models showed high levels of agreement on antigen importance rankings, and the results corroborated those of previous studies. This study demonstrates the utility of classification algorithms in calibrating H. pylori multiplex serology to Western blot.
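As an illustration of the three performance measures named in the abstract, the following is a minimal sketch in plain Python, not the authors' code. The function names and the toy labels and probabilities are hypothetical and are not study data. The logarithmic score is written here as a mean negative log-likelihood (lower is better), which is one common convention and an assumption about the one used in the study.

```python
import math


def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and 0/1 outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def log_score(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of the observed outcomes (lower is better).

    The sign convention is an assumption; some authors report the positively
    oriented version instead.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def auc(y_true, p_pred):
    """Probability that a random positive is scored above a random negative (ties count half)."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 1 = seropositive by the reference test, 0 = seronegative.
y = [1, 0, 1, 1, 0, 0]
p = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1]

print("Brier score:", brier_score(y, p))
print("Log score:  ", log_score(y, p))
print("AUC:        ", auc(y, p))
```

The Brier and logarithmic scores assess the predicted probabilities themselves, whereas the AUC depends only on how the cases are ranked. This is one reason models with similar AUC values can still differ on the other two scores.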