Optimization Online


A minibatch stochastic Quasi-Newton method adapted for nonconvex deep learning problems

Joshua Griffin (joshua.griffin***at***sas.com)
Majid Jahani (maj316***at***lehigh.edu)
Martin Takac (martin.taki***at***gmail.com)
Seyedalireza Yektamaram (alireza.yektamaram***at***sas.com)
Wenwen Zhou (wenwen.zhou***at***sas.com)

Abstract: In this study, we develop a limited-memory nonconvex quasi-Newton (QN) method tailored to deep learning (DL) applications. Because the stochastic nature of sampled function information in minibatch processing can degrade the performance of QN methods, we use three strategies to address this issue: a novel progressive trust-region radius update suitable for stochastic models; batched evaluation, rather than evaluation over the entire data set, for selecting the gradient batch size; and a restart strategy for when the quasi-Newton approximation accuracy deteriorates. We analyze the convergence properties of the proposed method and provide theoretical analysis for the different components of the algorithm. The numerical results illustrate that the proposed method with these adjustments outperforms previous similar methods and is competitive with the best-tuned stochastic first-order methods in cases where a large batch size is required. Finally, we show empirically that the method is robust to the choice of hyperparameters and thus requires less tuning than the Stochastic Gradient Descent (SGD) method.

Keywords: deep learning; quasi-Newton methods; Hessian-free methods

Category 1: Nonlinear Optimization

Category 2: Nonlinear Optimization (Unconstrained Optimization)

Citation: SAS Institute 100 SAS Campus Drive Cary, NC 27513; Industrial and Systems Engineering Lehigh University Bethlehem, PA 18015; Mohamed bin Zayed University of Artificial Intelligence (MBZUAI)

Download: [PDF]

Entry Submitted: 01/07/2022
Entry Accepted: 01/07/2022
Entry Last Modified: 01/13/2022
