Optimization Online

Adaptive Gradient Descent without Descent

Yura Malitsky (y.malitsky***at***gmail.com)
Konstantin Mishchenko (konstantin.mishchenko***at***kaust.edu.sa)

Abstract: We present a strikingly simple proof that two rules are sufficient to automate gradient descent: 1) don't increase the stepsize too fast and 2) don't overstep the local curvature. No function values, no line search, and no information about the function beyond its gradients are needed. By following these rules, you get a method adaptive to the local geometry, with convergence guarantees that depend only on smoothness in a neighborhood of a solution. Provided the problem is convex, our method converges even if the global smoothness constant is infinite; as an illustration, it can minimize an arbitrary twice continuously differentiable convex function. We examine its performance on a range of convex and nonconvex problems, including matrix factorization and training ResNet-18.
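To make the two rules concrete, below is a minimal NumPy sketch of an adaptive stepsize of this kind. The function name adaptive_gd, the initial stepsize lam0, and the stopping heuristic are illustrative assumptions, and the exact update may differ from the paper's; see the PDF for the authors' algorithm.

```python
import numpy as np

def adaptive_gd(grad, x0, n_iters=1000, lam0=1e-6):
    """Sketch of adaptive gradient descent: the stepsize grows at a
    bounded rate (rule 1) and never exceeds an estimate of the inverse
    local curvature (rule 2). `grad` returns the gradient at a point."""
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    lam_prev = lam0
    theta = np.inf                      # no growth cap on the first step
    x = x_prev - lam_prev * g_prev      # one plain step to get started
    for _ in range(n_iters):
        g = grad(x)
        diff = np.linalg.norm(g - g_prev)
        if diff == 0.0:                 # gradient unchanged: stop
            break
        # Rule 2: don't overstep the local curvature,
        # estimated as ||x_k - x_{k-1}|| / (2 ||g_k - g_{k-1}||).
        curv_bound = np.linalg.norm(x - x_prev) / (2.0 * diff)
        # Rule 1: don't increase the stepsize too fast
        # (growth capped by a factor sqrt(1 + theta)).
        lam = min(np.sqrt(1.0 + theta) * lam_prev, curv_bound)
        x_prev, g_prev = x, g
        x = x - lam * g
        theta, lam_prev = lam / lam_prev, lam
    return x
```

For instance, on a quadratic with grad = lambda x: A @ x - b, the stepsize should adapt to the curvature of A within a few iterations, with no Lipschitz constant supplied by the user.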

Keywords: gradient descent, adaptivity, stepsize, line search

Category 1: Convex and Nonsmooth Optimization (Convex Optimization)

Citation:

Download: [PDF]

Entry Submitted: 10/29/2019
Entry Accepted: 10/29/2019
Entry Last Modified: 10/31/2019
