Optimization Online


Gradient Descent using Duality Structures

Thomas Flynn (tflynn***at***gradcenter.cuny.edu)

Abstract: In most applications of gradient-based optimization to complex problems the choice of step size is based on trial-and-error and other heuristics. A case when it is easy to choose the step sizes is when the function has a Lipschitz continuous gradient. Many functions of interest do not appear at first sight to have this property, but often it can be established with the right choice of underlying metric. We find a simple recipe for choosing step sizes when a function has a Lipschitz gradient with respect to any Finsler structure that verifies an exponential bound. These step sizes are guaranteed to give convergence, but they may be conservative since they rely on an exponential bound. However, when relevant problem structure can be encoded in the metric to yield a significantly tighter bound while keeping optimization tractable, this may lead to rigorous and efficient algorithms. In particular, our general result can be applied to yield an optimization algorithm with non-asymptotic performance guarantees for batch optimization of multilayer neural networks.

Keywords: optimization, neural networks,

Category 1: Nonlinear Optimization (Unconstrained Optimization )


Download: [PDF]

Entry Submitted: 08/01/2017
Entry Accepted: 08/01/2017
Entry Last Modified: 08/31/2017

Modify/Update this entry

  Visitors Authors More about us Links
  Subscribe, Unsubscribe
Digest Archive
Search, Browse the Repository


Coordinator's Board
Classification Scheme
Give us feedback
Optimization Journals, Sites, Societies
Mathematical Optimization Society