Title: Permutation-based variable importance determination for deep learning
Authors: Matthias Medl - University of Natural Resources and Life Sciences, Vienna (Austria)
Theresa Scharl - Boku Vienna (Austria)
Astrid Duerauer - University of Natural Resources and Life Sciences Vienna (Austria)
Friedrich Leisch - Universitaet fuer Bodenkultur Vienna (Austria)
Matthias Medl - University of Natural Resources and Life Sciences, Vienna (Austria) [presenting]
Abstract: Statistical models that predict process variables which cannot be measured in real time have become an effective tool for monitoring biopharmaceutical production processes. Novel measurement devices that capture a wide array of physical properties of process intermediates online have expanded the variable space available for building these models. However, extracting all the information contained in this high-dimensional variable space is a challenge. To overcome it, we propose a deep-learning framework capable of processing the whole variable space to estimate critical process parameters in real time, e.g. product or impurity concentrations in a biopharmaceutical purification process. The models consist of two parallel strands whose outputs are later concatenated. One strand leverages the pattern recognition capabilities of convolutional layers to process spectral data, while the other processes single-variable measurements with fully connected layers. To gain insight into the inner workings of the models, a permutation-based methodology has been developed to estimate the variable importance at each time point throughout the process. The variable importance workflow has subsequently been validated on artificial data with a similar structure, in which the variable importance was predefined and thus known.
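
Illustration (not part of the abstract): the abstract gives no implementation details, so the following is only a minimal sketch of how permutation-based variable importance per time point could be computed and checked against artificial data with known importance. Everything in it is an assumption made for illustration: the function name `permutation_importance_per_timepoint`, the array layout (runs x time points x variables), the mean squared error as the loss, and the linear stand-in `predict` that replaces the two-strand deep-learning model. It also assumes that the prediction at a time point depends only on the inputs at that same time point.

```python
import numpy as np


def permutation_importance_per_timepoint(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in squared error per time point when one variable is shuffled.

    X: array of shape (n_runs, n_timepoints, n_variables)
    y: array of shape (n_runs, n_timepoints)
    predict: callable mapping X to predictions shaped like y (hypothetical model)
    Returns an array of shape (n_timepoints, n_variables).
    """
    rng = np.random.default_rng(seed)
    n_runs, n_time, n_vars = X.shape
    # Baseline error, averaged over runs, kept separate for each time point.
    baseline_mse = ((predict(X) - y) ** 2).mean(axis=0)
    importance = np.zeros((n_time, n_vars))
    for j in range(n_vars):
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffle variable j across runs, independently at each time point,
            # which breaks its link to the target but keeps its distribution.
            for t in range(n_time):
                X_perm[:, t, j] = rng.permutation(X[:, t, j])
            permuted_mse = ((predict(X_perm) - y) ** 2).mean(axis=0)
            importance[:, j] += permuted_mse - baseline_mse
    return importance / n_repeats


# Artificial data with predefined importance (assumed weights):
# variable 0 matters early, variable 1 matters late, variable 2 never matters.
weights = np.array([
    [1.0, 0.0, 0.0],
    [0.7, 0.3, 0.0],
    [0.3, 0.7, 0.0],
    [0.0, 1.0, 0.0],
])  # shape (n_timepoints, n_variables)

data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 4, 3))


def predict(inputs):
    # Stand-in for a trained model: a time-dependent weighted sum of the variables.
    return np.einsum("rtj,tj->rt", inputs, weights)


y = predict(X)
importance = permutation_importance_per_timepoint(predict, X, y)
print(np.round(importance, 2))
```

In this toy setting the estimated importance should follow the squared weights: variable 0 dominates at the first time point, variable 1 at the last, and variable 2 stays at zero throughout. A real workflow would replace `predict` with the trained network and would likely evaluate on held-out runs.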