Optimization Online


Distributionally Robust Logistic Regression

Soroosh Shafieezadeh-Abadeh (soroosh.shafiee***at***epfl.ch)
Peyman Mohajerin Esfahani (peyman.mohajerin***at***epfl.ch)
Daniel Kuhn (daniel.kuhn***at***epfl.ch)

Abstract: This paper proposes a distributionally robust approach to logistic regression. We use the Wasserstein distance to construct a ball in the space of probability distributions centered at the uniform distribution on the training samples. If the radius of this ball is chosen judiciously, we can guarantee that it contains the unknown data-generating distribution with high confidence. We then formulate a distributionally robust logistic regression model that minimizes a worst-case expected logloss function, where the worst case is taken over all distributions in the Wasserstein ball. We prove that this optimization problem admits a tractable reformulation and encapsulates the classical as well as the popular regularized logistic regression problems as special cases. We further propose a distributionally robust approach based on Wasserstein balls to compute upper and lower confidence bounds on the misclassification probability of the resulting classifier. These bounds are given by the optimal values of two highly tractable linear programs. We validate our theoretical out-of-sample guarantees through simulated and empirical experiments.
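As a rough sketch of the model the abstract describes, the worst-case problem can be written as follows. The notation is assumed here for illustration and is not taken from this entry: $\hat{P}_N$ denotes the uniform distribution on the $N$ training samples $(x_i, y_i)$ with labels $y_i \in \{-1, +1\}$, $W$ the Wasserstein distance, $\varepsilon \ge 0$ the radius of the ball, and $\beta$ the regression coefficients.

```latex
% Assumed notation (illustrative, not from the entry itself):
%   \hat{P}_N   uniform (empirical) distribution on the training samples
%   W           Wasserstein distance between distributions
%   \varepsilon radius of the Wasserstein ball
\[
  \hat{P}_N = \frac{1}{N} \sum_{i=1}^{N} \delta_{(x_i, y_i)},
  \qquad
  \mathbb{B}_{\varepsilon}(\hat{P}_N)
    = \bigl\{ Q : W(Q, \hat{P}_N) \le \varepsilon \bigr\},
\]
\[
  \inf_{\beta} \;
  \sup_{Q \in \mathbb{B}_{\varepsilon}(\hat{P}_N)} \;
  \mathbb{E}^{Q}\!\left[ \log\bigl( 1 + \exp( -y \, \langle \beta, x \rangle ) \bigr) \right].
\]
```

Under this notation, setting $\varepsilon = 0$ collapses the ball to $\hat{P}_N$ alone, so the objective becomes the empirical average logloss of classical logistic regression.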

Keywords: Distributionally robust optimization, logistic regression, Wasserstein distance

Category 1: Robust Optimization

Category 2: Stochastic Programming

Citation: Risk Analytics and Optimization Chair, EPFL


Entry Submitted: 09/30/2015
Entry Accepted: 09/30/2015
Entry Last Modified: 12/01/2015
