Optimization Online


Distributional Robustness and Regularization in Statistical Learning

Rui Gao(rgao32***at***gatech.edu)
Xi Chen(xchen3***at***stern.nyu.edu)
Anton Kleywegt(anton***at***isye.gatech.edu)

Abstract: A central question in statistical learning is how to design algorithms that not only perform well on training data but also generalize to new, unseen data. In this paper, we tackle this question by formulating a distributionally robust stochastic optimization (DRSO) problem, which seeks a solution that minimizes the worst-case expected loss over a family of distributions that are close to the empirical distribution in Wasserstein distance. We establish a connection between such Wasserstein DRSO and regularization. More precisely, we identify a broad class of loss functions for which the Wasserstein DRSO is asymptotically equivalent to a regularization problem with a gradient-norm penalty. This relation provides new interpretations for problems involving regularization, including many statistical learning problems and discrete choice models (e.g. multinomial logit). The connection also suggests a principled way to regularize high-dimensional and non-convex problems, which is demonstrated through the training of Wasserstein generative adversarial networks in deep learning.
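To illustrate the asymptotic equivalence described above, the following sketch spells it out for binary logistic regression, where the gradient of the loss with respect to the input has a closed form. This is an illustrative example under our own assumptions, not code from the paper; the function names (`drso_surrogate`, `grad_norm_penalty`) and the choice of the Euclidean norm are hypothetical.

```python
import numpy as np

def logistic_loss(w, X, y):
    # Per-sample logistic loss l(w; x, y) = log(1 + exp(-y * <w, x>)),
    # with labels y in {-1, +1}.
    margins = y * (X @ w)
    return np.logaddexp(0.0, -margins)

def grad_norm_penalty(w, X, y):
    # For logistic loss, the gradient with respect to the input x is
    # -y * sigmoid(-y * <w, x>) * w, so its Euclidean norm is
    # sigmoid(-y * <w, x>) * ||w||_2.
    margins = y * (X @ w)
    sigma = 1.0 / (1.0 + np.exp(margins))
    return sigma * np.linalg.norm(w)

def drso_surrogate(w, X, y, radius):
    # Empirical risk plus a gradient-norm penalty scaled by the radius
    # of the Wasserstein ball -- the regularization problem that the
    # paper shows is asymptotically equivalent to the worst-case
    # expected loss (illustrative surrogate only).
    return logistic_loss(w, X, y).mean() + radius * grad_norm_penalty(w, X, y).mean()
```

At `w = 0` the penalty vanishes and the surrogate reduces to the plain empirical risk `log 2`; for nonzero `w` the penalty grows with the Wasserstein radius, recovering the usual bias toward flatter (smaller-gradient) solutions.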

Keywords: Wasserstein distance; regularization; deep learning; generative adversarial networks; choice model

Category 1: Stochastic Programming

Category 2: Robust Optimization

Category 3: Applications -- OR and Management Sciences


Download: [PDF]

Entry Submitted: 01/05/2018
Entry Accepted: 01/05/2018
Entry Last Modified: 01/05/2018
