A Progressive Batching L-BFGS Method for Machine Learning
Authors: Raghu Bollapragada (raghu.bollapragada)

Abstract: The standard L-BFGS method relies on gradient approximations that are not dominated by noise, so that search directions are descent directions, the line search is reliable, and quasi-Newton updating yields useful quadratic models of the objective function. All of this appears to call for a full batch approach, but since small batch sizes give rise to faster algorithms with better generalization properties, L-BFGS is currently not considered an algorithm of choice for large-scale machine learning applications. One need not, however, choose between the two extremes represented by the full batch or highly stochastic regimes, and may instead follow a progressive batching approach in which the sample size increases during the course of the optimization. In this paper, we present a new version of the L-BFGS algorithm that combines three basic components (progressive batching, a stochastic line search, and stable quasi-Newton updating) and that performs well when training logistic regression models and deep neural networks. We provide supporting convergence theory for the method.

Keywords: Nonconvex Optimization, Stochastic Optimization, Deep Learning, Sample Selection

Category 1: Nonlinear Optimization
Category 2: Convex and Nonsmooth Optimization (Convex Optimization)
Category 3: Stochastic Programming

Download: [PDF]

Entry Submitted: 02/14/2018
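To make the three components named in the abstract concrete, the following is a minimal Python sketch of a progressive-batching L-BFGS loop. It is not the authors' exact algorithm: the toy logistic-regression data, the helper names (`loss_grad`, `two_loop`), the Armijo backtracking line search on the sampled loss, the curvature-pair skipping rule, and the crude variance test used to grow the batch are all illustrative assumptions; the paper's own tests and update rules differ in their details.

```python
# Sketch of a progressive-batching L-BFGS loop (illustrative assumptions only):
#   1) progressive batching: the sample grows when gradient noise dominates,
#   2) a stochastic (sampled) backtracking line search,
#   3) stable quasi-Newton updating via curvature-pair skipping.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical binary logistic regression data.
n, d = 2000, 20
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

def loss_grad(w, idx):
    """Sampled logistic loss and gradient over the examples in idx."""
    Xb, yb = X[idx], y[idx]
    p = 1.0 / (1.0 + np.exp(-Xb @ w))
    loss = -np.mean(yb * np.log(p + 1e-12) + (1 - yb) * np.log(1 - p + 1e-12))
    grad = Xb.T @ (p - yb) / len(idx)
    return loss, grad

def two_loop(grad, s_list, y_list):
    """Standard L-BFGS two-loop recursion: returns the direction -H*grad."""
    q = grad.copy()
    alphas = []
    for s, yv in zip(reversed(s_list), reversed(y_list)):
        a = (s @ q) / (yv @ s)
        alphas.append(a)
        q -= a * yv
    if s_list:  # scale by the usual gamma = s'y / y'y
        q *= (s_list[-1] @ y_list[-1]) / (y_list[-1] @ y_list[-1])
    for (s, yv), a in zip(zip(s_list, y_list), reversed(alphas)):
        b = (yv @ q) / (yv @ s)
        q += (a - b) * s
    return -q

w = np.zeros(d)
batch = 32            # initial (small) batch size
mem = 10              # L-BFGS memory
s_list, y_list = [], []

for it in range(100):
    idx = rng.choice(n, size=min(batch, n), replace=False)
    f, g = loss_grad(w, idx)
    p = two_loop(g, s_list, y_list)

    # Stochastic backtracking (Armijo) line search on the sampled objective.
    alpha, c1 = 1.0, 1e-4
    while alpha > 1e-8:
        f_new, _ = loss_grad(w + alpha * p, idx)
        if f_new <= f + c1 * alpha * (g @ p):
            break
        alpha *= 0.5

    w_new = w + alpha * p
    _, g_new = loss_grad(w_new, idx)   # gradient at the new point, same sample

    # Stable quasi-Newton updating: keep the pair only if curvature is positive.
    s_vec, y_vec = w_new - w, g_new - g
    if y_vec @ s_vec > 1e-10 * (s_vec @ s_vec):
        s_list.append(s_vec); y_list.append(y_vec)
        if len(s_list) > mem:
            s_list.pop(0); y_list.pop(0)

    # Progressive batching: enlarge the sample when the per-example gradient
    # variance dominates the squared gradient norm (a crude proxy test).
    per_ex = np.array([loss_grad(w_new, [i])[1] for i in idx])
    if per_ex.var(axis=0).sum() / len(idx) > (g_new @ g_new):
        batch = min(2 * batch, n)

    w = w_new
```

The key design point the sketch tries to convey is that the small initial batch keeps early iterations cheap, while the noise test gradually enlarges the sample so that, later on, search directions and curvature pairs are accurate enough for the line search and the quasi-Newton model to behave as they would in a full-batch method.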