Title: Clustering methods for consumer credit risk modelling
Authors: Yazhe Li - Imperial College London (United Kingdom) [presenting]
Niall Adams - Imperial College London and University of Bristol (United Kingdom)
Tony Bellotti - Imperial College London (United Kingdom)
Abstract: For credit risk modelling, we propose applying automated clustering to the class of default loans as a precursor to building a credit risk model with logistic regression. We illustrate the method on simulated and real data sets (mortgages and credit cards), using k-means, k-medoids and hierarchical clustering. Our results show that clustering can enhance predictive performance when the defaults can be demonstrably grouped into well-separated clusters. We motivate the approach with asymptotic results demonstrating that, in highly imbalanced data problems such as default modelling, logistic regression uses the rare class data points only via the rare class mean vector.
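Illustrative sketch (not the authors' implementation): the abstract does not specify how the clusters feed into the logistic regression, so the Python example below assumes one plausible reading. It clusters the default class with k-means, relabels each default by its cluster, fits a multiclass logistic regression, and takes the default probability as one minus the non-default probability. The function names, the choice of scikit-learn, the number of clusters and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression


def fit_clustered_default_model(X, y, n_clusters=2, seed=0):
    """Cluster the defaults (y == 1), then fit a multiclass logistic regression.

    Assumed labelling: 0 = non-default, 1..n_clusters = default clusters.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    default_mask = y == 1

    # Step 1: cluster only the rare (default) class.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    cluster_ids = km.fit_predict(X[default_mask])

    # Step 2: replace the single default label by one label per cluster.
    y_multi = np.zeros(len(y), dtype=int)
    y_multi[default_mask] = cluster_ids + 1

    # Step 3: logistic regression over non-default plus default clusters.
    model = LogisticRegression(max_iter=1000)
    model.fit(X, y_multi)
    return model


def predict_default_probability(model, X):
    """Probability of default = 1 - P(non-default)."""
    proba = model.predict_proba(np.asarray(X, dtype=float))
    non_default_col = list(model.classes_).index(0)
    return 1.0 - proba[:, non_default_col]


# Synthetic, imbalanced data (hypothetical): two well-separated default
# clusters placed symmetrically, so the mean of the defaults lies close to
# the mean of the non-defaults.
rng = np.random.default_rng(0)
X_good = rng.normal(0.0, 1.0, size=(2000, 2))
X_bad = np.vstack([
    rng.normal(4.0, 0.5, size=(50, 2)),
    rng.normal(-4.0, 0.5, size=(50, 2)),
])
X = np.vstack([X_good, X_bad])
y = np.concatenate([np.zeros(2000, dtype=int), np.ones(100, dtype=int)])

model = fit_clustered_default_model(X, y, n_clusters=2)
p_default = predict_default_probability(model, X)
```

The synthetic defaults are deliberately symmetric about the non-default mean: a single-label logistic regression that sees the defaults mainly through their mean vector would have little signal here, whereas per-cluster labels give each group its own decision boundary. k-medoids or hierarchical clustering could be substituted in Step 1 under the same assumptions.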