Optimization Online


Barzilai-Borwein Step Size for Stochastic Gradient Descent

Conghui Tan(chtan***at***se.cuhk.edu.hk)
Shiqian Ma(sqma***at***se.cuhk.edu.hk)
Yu-Hong Dai(dyh***at***lsec.cc.ac.cn)
Yuqiu Qian(qyq79***at***connect.hku.hk)

Abstract: One of the major issues in stochastic gradient descent (SGD) methods is how to choose an appropriate step size while running the algorithm. Since the traditional line search technique does not apply for stochastic optimization algorithms, the common practice in SGD is either to use a diminishing step size, or to tune a fixed step size by hand, which can be time consuming in practice. In this paper, we propose to use the Barzilai-Borwein (BB) method to automatically compute step sizes for SGD and its variant: stochastic variance reduced gradient (SVRG) method, which leads to two algorithms: SGD-BB and SVRG-BB. We prove that SVRG-BB converges linearly for strongly convex objective functions. As a by-product, we prove the linear convergence result of SVRG with Option I proposed in [10], whose convergence result is missing in the literature. Numerical experiments on standard data sets show that the performance of SGD-BB and SVRG-BB is comparable to and sometimes even better than SGD and SVRG with best-tuned step sizes, and is superior to some advanced SGD variants.

Keywords: Barzilai-Borwein method, step size, stochastic gradient descent

Category 1: Convex and Nonsmooth Optimization (Convex Optimization )

Category 2: Global Optimization (Stochastic Approaches )

Citation: arXiv:1605.04131

Download: [PDF]

Entry Submitted: 05/22/2016
Entry Accepted: 05/22/2016
Entry Last Modified: 05/22/2016

Modify/Update this entry

  Visitors Authors More about us Links
  Subscribe, Unsubscribe
Digest Archive
Search, Browse the Repository


Coordinator's Board
Classification Scheme
Give us feedback
Optimization Journals, Sites, Societies
Mathematical Optimization Society