Optimization Online


A Proximal Stochastic Gradient Method with Progressive Variance Reduction

Lin Xiao (lin.xiao@microsoft.com)
Tong Zhang (tzhang@stat.rutgers.edu)

Abstract: We consider the problem of minimizing the sum of two convex functions: one is the average of a large number of smooth component functions, and the other is a general convex function that admits a simple proximal mapping. We assume the whole objective function is strongly convex. Such problems often arise in machine learning, where they are known as regularized empirical risk minimization. We propose and analyze a new proximal stochastic gradient method, which uses a multi-stage scheme to progressively reduce the variance of the stochastic gradient. While each iteration of this algorithm has a cost similar to that of the classical stochastic gradient method (or incremental gradient method), we show that the expected objective value converges to the optimum at a geometric rate. The overall complexity of this method is much lower than that of both the proximal full gradient method and the standard proximal stochastic gradient method.
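The multi-stage scheme described in the abstract can be sketched as follows. This is a minimal illustrative implementation, not the paper's own pseudocode: the concrete instance (least-squares losses with an l1 regularizer), the step size, the inner-loop length, and the last-iterate snapshot rule are all assumptions made here for the sake of a runnable example. Each stage computes one full gradient at a snapshot point, then runs cheap stochastic steps whose gradient estimates are corrected by the snapshot gradient, which drives their variance toward zero as the iterates converge.

```python
import numpy as np

def prox_l1(x, t):
    # Proximal mapping of t * ||.||_1 (soft-thresholding);
    # stands in for the "simple proximal mapping" of the regularizer.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def prox_svrg(A, b, lam, eta, n_stages=30, m=None, seed=0):
    """Multi-stage proximal stochastic gradient with variance reduction,
    sketched for min_x (1/2n)||Ax - b||^2 + lam * ||x||_1.
    All parameter choices here are illustrative, not from the paper."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    m = m if m is not None else 2 * n   # inner iterations per stage
    x_tilde = np.zeros(d)               # stage snapshot
    for _ in range(n_stages):
        # Full gradient of the smooth part, computed once per stage.
        mu = A.T @ (A @ x_tilde - b) / n
        x = x_tilde.copy()
        for _ in range(m):
            i = rng.integers(n)
            # Variance-reduced estimate: component gradient at x,
            # corrected by its value at the snapshot plus the full gradient.
            g_i = A[i] * (A[i] @ x - b[i])
            g_i_tilde = A[i] * (A[i] @ x_tilde - b[i])
            v = g_i - g_i_tilde + mu
            x = prox_l1(x - eta * v, eta * lam)
        x_tilde = x  # last inner iterate becomes the new snapshot
    return x_tilde
```

Note that E[v] equals the full gradient at x, so the correction introduces no bias; its variance shrinks as both x and the snapshot approach the minimizer, which is what permits a constant step size and the geometric rate claimed in the abstract.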

Keywords: stochastic gradient method, incremental gradient method, proximal gradient method

Category 1: Convex and Nonsmooth Optimization

Category 2: Stochastic Programming

Citation: Microsoft Research Tech Report: MSR-TR-2014-38, March 2014

Download: [PDF]

Entry Submitted: 03/19/2014
Entry Accepted: 03/22/2014
Entry Last Modified: 03/19/2014
