CMStatistics 2020
Title: The linear additive tree
Authors: Efstathios Gennatas - University of California, San Francisco (United States) [presenting]
Jerome Friedman - Stanford University (United States)
Eric Eaton - University of Pennsylvania (United States)
Charles Simone II - New York Proton (United States)
Lyle Ungar - University of Pennsylvania (United States)
Lei Xing - Stanford University (United States)
Gilmer Valdes - UCSF (United States)
Abstract: The Linear Additive Tree (LINAD) is a novel algorithm that builds highly accurate and interpretable decision trees with linear models in the terminal nodes. An extension of the Additive Tree, LINAD combines the complementary strengths of decision trees and linear models with the additive training of gradient boosting, and can be considered a generalization of these three algorithms. A single LINAD is fully interpretable and rivals the performance of ensemble techniques, while an ensemble of LINADs can match or surpass current ensembles based on traditional decision trees. The algorithm's performance is demonstrated on a collection of 72 real and synthetic publicly available datasets. Across the 72 benchmarks, LINAD ranked ahead of random forest and just behind gradient boosting. At the same time, untuned LINAD ensembles matched the performance of gradient boosting while using far fewer trees. Following these successful benchmarks, LINAD will be applied to magnetic resonance scans from a large cohort of children (more than 10,000). High-dimensional neuroimaging data will be used to derive patterns of white and grey matter structure that predict neurocognitive performance. LINAD offers both accuracy and interpretability to foster discovery in basic research and provide trustworthiness in critical applications like clinical predictive modeling.
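As a rough illustration of what "linear models in the terminal nodes" can mean, the sketch below fits a one-split tree on a single feature and places an ordinary least-squares line in each leaf. This is not the LINAD training procedure, which the abstract describes only at a high level; the function names (`fit_line`, `fit_stump_with_linear_leaves`, `predict`), the exhaustive split search, and the toy data are all assumptions made for illustration.

```python
# Minimal sketch: a depth-1 tree with a linear model in each leaf.
# NOT the LINAD algorithm; all names and the split search are illustrative assumptions.

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # All x values identical: fall back to a constant prediction.
        return mean_y, 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sse(xs, ys, intercept, slope):
    """Sum of squared errors of the line on the given points."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def fit_stump_with_linear_leaves(xs, ys, min_leaf=2):
    """Try every split between sorted points; keep the one with the lowest total SSE."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left_x = [p[0] for p in pairs[:i]]
        left_y = [p[1] for p in pairs[:i]]
        right_x = [p[0] for p in pairs[i:]]
        right_y = [p[1] for p in pairs[i:]]
        left = fit_line(left_x, left_y)
        right = fit_line(right_x, right_y)
        total = sse(left_x, left_y, *left) + sse(right_x, right_y, *right)
        if best is None or total < best["sse"]:
            best = {
                "threshold": (pairs[i - 1][0] + pairs[i][0]) / 2,
                "left": left,
                "right": right,
                "sse": total,
            }
    return best


def predict(model, x):
    """Route x to a leaf by the threshold, then apply that leaf's line."""
    intercept, slope = model["left"] if x < model["threshold"] else model["right"]
    return intercept + slope * x


# Toy data: y = 1 - 2x for x < 0 and y = 1 + 3x for x > 0.
xs = [-4, -3, -2, -1, 1, 2, 3, 4]
ys = [9, 7, 5, 3, 4, 7, 10, 13]

model = fit_stump_with_linear_leaves(xs, ys)
print(model["threshold"], model["left"], model["right"])
print(predict(model, -2.5), predict(model, 2.5))
```

On this toy data a single split with two fitted lines reproduces the piecewise-linear target, whereas a conventional tree with constant leaves would need many splits to approximate it. The fitted model remains readable as one threshold and two (intercept, slope) pairs.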