Title: Inference for model-agnostic variable importance
Authors: Brian Williamson - Fred Hutchinson Cancer Research Center (United States)
Peter Gilbert - University of Washington and Fred Hutchinson Cancer Research Center (United States)
Noah Simon - University of Washington, Department of Biostatistics (United States)
Marco Carone - University of Washington (United States) [presenting]
Abstract: In many applications, it is of interest to assess the relative contribution of features (or subsets of features) toward the goal of predicting a response, that is, to gauge the variable importance of features. Most recent work on variable importance assessment has focused on describing the importance of features within the confines of a given prediction algorithm. However, such an assessment does not necessarily characterize the prediction potential of features and may provide a misleading reflection of the intrinsic value of these features. To address this limitation, we propose a general framework for nonparametric inference on interpretable, algorithm-agnostic variable importance. We define variable importance as a population-level contrast between the oracle predictiveness of all available features and that of all features except those under consideration. We then propose a nonparametric efficient estimation procedure that allows the construction of valid confidence intervals and tests, even when machine learning techniques are used.
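The contrast defined in the abstract can be illustrated with a rough numerical sketch. The code below is not the authors' estimator and does not construct confidence intervals or tests. It only computes a sample-split plug-in estimate of the difference in predictiveness between a model using all features and a model that omits the features of interest. Several choices here are assumptions made purely for illustration: R-squared as the predictiveness measure, ordinary least squares as a stand-in for a flexible learner, a single 50/50 train/test split, and all function names (`r_squared`, `fit_predict_ols`, `importance_estimate`) along with the simulated data.

```python
import numpy as np


def r_squared(y, pred):
    """Assumed predictiveness measure: proportion of outcome variance explained."""
    return 1.0 - np.mean((y - pred) ** 2) / np.var(y)


def fit_predict_ols(x_train, y_train, x_test):
    """Stand-in learner (assumption): least squares with an intercept."""
    a_train = np.column_stack([np.ones(len(x_train)), x_train])
    a_test = np.column_stack([np.ones(len(x_test)), x_test])
    coef, *_ = np.linalg.lstsq(a_train, y_train, rcond=None)
    return a_test @ coef


def importance_estimate(x, y, drop_cols, seed=0):
    """Plug-in contrast: predictiveness with all features minus
    predictiveness without the columns listed in drop_cols."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    half = len(y) // 2
    train, test = idx[:half], idx[half:]
    keep = [j for j in range(x.shape[1]) if j not in drop_cols]

    # Fit on the training half, evaluate predictiveness on the held-out half.
    full_pred = fit_predict_ols(x[train], y[train], x[test])
    reduced_pred = fit_predict_ols(x[train][:, keep], y[train], x[test][:, keep])
    return r_squared(y[test], full_pred) - r_squared(y[test], reduced_pred)


# Hypothetical simulated data: the third feature is pure noise.
rng = np.random.default_rng(1)
x = rng.normal(size=(4000, 3))
y = 2.0 * x[:, 0] + x[:, 1] + rng.normal(size=4000)

# Under this simulation Var(y) = 6, so the population contrast is
# 4/6 for the first feature and 0 for the third.
imp_x0 = importance_estimate(x, y, drop_cols=[0])
imp_x2 = importance_estimate(x, y, drop_cols=[2])
print(f"importance of feature 0: {imp_x0:.3f}")
print(f"importance of feature 2: {imp_x2:.3f}")
```

Because the contrast is defined through the best achievable prediction function rather than through any particular fitted model, the least-squares learner here could in principle be swapped for any regression method. The linear learner is adequate in this sketch only because the simulated outcome is linear in the features.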