Title: Computational properties of solution methods for logistic regression through the lens of condition numbers
Authors: Robert Freund - MIT (United States)
Paul Grigas - University of California, Berkeley (United States) [presenting]
Rahul Mazumder - MIT (United States)
Abstract: The simple probabilistic model underlying logistic regression suggests that it is most natural and sensible to apply logistic regression when the data is not linearly separable. Building on this basic intuition, we introduce a pair of condition numbers that measure the degree of non-separability or separability of a given dataset (in the setting of binary classification). When the sample data is not separable, the degree of non-separability naturally enters the analysis and informs the properties and convergence guarantees of a wide variety of computational methods, including standard first-order methods such as steepest descent and greedy coordinate descent. When the sample data is separable -- in which case standard logistic regression will break down -- the degree of separability can be used to show, rather surprisingly, that standard first-order methods with implicit regularization also deliver approximate-maximum-margin solutions with associated computational guarantees. Finally, to further enhance our understanding of the computational properties of several methods, we also take advantage of recent results on self-concordant-like properties of logistic regression due to Bach.
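
As a rough illustration of the separable case described in the abstract, the following sketch runs plain gradient descent on the unregularized logistic loss for a small, hand-made separable dataset. It is not the authors' method or analysis: the toy data, the step size, the iteration counts, and the helper names (logistic_loss_grad, gradient_descent, normalized_margin) are all assumptions made for this example, and "implicit regularization" is interpreted here simply as stopping after a finite number of steps with no explicit penalty term. The condition numbers themselves are not computed, since the abstract does not define them.

```python
import numpy as np


def logistic_loss_grad(beta, X, y):
    """Gradient of (1/n) * sum_i log(1 + exp(-y_i * x_i^T beta))."""
    margins = y * (X @ beta)
    # 1 / (1 + exp(m)) written via tanh to avoid overflow for large |m|
    weights = 0.5 * (1.0 - np.tanh(0.5 * margins))
    return -(X.T @ (weights * y)) / len(y)


def gradient_descent(X, y, steps, step_size=0.5):
    """Fixed-step gradient descent from zero, with no explicit penalty."""
    beta = np.zeros(X.shape[1])
    for _ in range(steps):
        beta = beta - step_size * logistic_loss_grad(beta, X, y)
    return beta


def normalized_margin(beta, X, y):
    """Smallest value of y_i * x_i^T beta, divided by the norm of beta."""
    return np.min(y * (X @ beta)) / np.linalg.norm(beta)


# Hypothetical toy data: linearly separable, labels in {-1, +1}
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-3.0, -1.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

beta_short = gradient_descent(X, y, steps=100)
beta_long = gradient_descent(X, y, steps=5000)

for name, beta in [("100 steps", beta_short), ("5000 steps", beta_long)]:
    print(name,
          "| norm:", np.linalg.norm(beta),
          "| normalized margin:", normalized_margin(beta, X, y))
```

On separable data the logistic loss has no finite minimizer, so the norm of the iterate keeps growing with the number of steps; what remains meaningful is the direction of the iterate, which the normalized margin summarizes.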