Optimization Online


Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization

Lin Xiao (lin.xiao***at***microsoft.com)

Abstract: We consider regularized stochastic learning and online optimization problems, where the objective function is the sum of two convex terms: one is the loss function of the learning task, and the other is a simple regularization term such as the $\ell_1$-norm for promoting sparsity. We develop extensions of Nesterov's dual averaging method that can exploit the regularization structure in an online setting. At each iteration of these methods, the learning variables are adjusted by solving a simple minimization problem that involves the running average of all past subgradients of the loss function and the whole regularization term, not just its subgradient. In the case of $\ell_1$-regularization, our method is particularly effective in obtaining sparse solutions. We show that these methods achieve the optimal convergence rates or regret bounds that are standard in the literature on stochastic and online convex optimization. For stochastic learning problems in which the loss functions have Lipschitz continuous gradients, we also present an accelerated version of the dual averaging method.
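The iteration described in the abstract can be sketched for the $\ell_1$-regularized case. The code below is a minimal illustration, not the paper's reference implementation: it assumes the common choices of a squared-norm prox-function and step-size scaling $\beta_t = \gamma\sqrt{t}$, under which the per-step minimization (running subgradient average plus the whole $\ell_1$ term) has a closed-form soft-threshold solution. The synthetic least-squares problem, the function names, and all parameter values are hypothetical.

```python
import numpy as np

def rda_l1(grad_fn, dim, T, lam=0.05, gamma=5.0, seed=0):
    """Sketch of a regularized dual averaging step with l1 regularization.

    At step t, with g_bar the running average of past stochastic
    subgradients, set
        w_{t+1} = argmin_w  <g_bar, w> + lam*||w||_1 + (beta_t/t)*||w||^2/2,
    beta_t = gamma*sqrt(t), whose solution is the entrywise truncation below.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(dim)
    g_bar = np.zeros(dim)
    for t in range(1, T + 1):
        g = grad_fn(w, rng)            # stochastic subgradient of the loss at w
        g_bar += (g - g_bar) / t       # running average of all past subgradients
        beta_t = gamma * np.sqrt(t)
        # coordinates with |g_bar| <= lam are set exactly to zero (sparsity)
        shrunk = np.sign(g_bar) * np.maximum(np.abs(g_bar) - lam, 0.0)
        w = -(t / beta_t) * shrunk
    return w

# hypothetical test problem: sparse least-squares, only 3 of 20
# coordinates of the true weight vector are nonzero
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.5, 1.0]

def grad(w, rng):
    x = rng.standard_normal(20)
    y = x @ w_true + 0.01 * rng.standard_normal()
    return (x @ w - y) * x             # gradient of 0.5*(x @ w - y)**2

w_hat = rda_l1(grad, dim=20, T=2000)
```

Because the whole regularization term (not just a subgradient of it) enters each minimization, irrelevant coordinates are truncated to exact zeros rather than merely shrunk toward zero, which is the sparsity effect the abstract highlights.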

Keywords: stochastic learning, online optimization, sparse regularization, dual averaging method, accelerated gradient method

Category 1: Convex and Nonsmooth Optimization

Category 2: Stochastic Programming

Citation: Journal of Machine Learning Research, Vol. 11, pp. 2543-2596, October 2010. http://jmlr.csail.mit.edu/papers/v11/xiao10a.html


Entry Submitted: 04/15/2010
Entry Accepted: 04/15/2010
Entry Last Modified: 11/15/2010
