First-Order Algorithms Converge Faster than $O(1/k)$ on Convex Problems

Ching-pei Lee (ching-pei@cs.wisc.edu)
Stephen Wright (swright@cs.wisc.edu)

Abstract: It has been known for many years that both gradient descent and stochastic coordinate descent achieve a global convergence rate of $O(1/k)$ in the objective value when applied to minimizing a Lipschitz-continuously differentiable, unconstrained convex function. In this work, we improve this rate to $o(1/k)$. We extend the result to proximal gradient and proximal coordinate descent on regularized problems, showing similar $o(1/k)$ convergence rates. The result is tight in the sense that an $O(1/k^{1+\epsilon})$ rate is not generally attainable for any $\epsilon > 0$, for any of these methods.

Keywords: Gradient descent methods; Coordinate descent methods; Proximal gradient methods; Convex optimization; Complexity

Category 1: Convex and Nonsmooth Optimization

Download: [PDF]
Entry Submitted: 12/20/2018
Entry Accepted: 12/20/2018
Entry Last Modified: 01/24/2019
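As a hypothetical illustration of the abstract's claim (not taken from the paper), the sketch below runs gradient descent with the standard step size $1/L$ on the convex, non-strongly-convex function $f(x) = x^4/4$ and tracks $k\,(f(x_k) - f^*)$. An $o(1/k)$ rate predicts that this quantity shrinks toward zero, whereas a plain $O(1/k)$ bound would only keep it bounded. The test function, starting point, and Lipschitz constant are assumptions chosen for illustration.

```python
# Hypothetical illustration (not from the paper): gradient descent on the
# convex but not strongly convex function f(x) = x^4 / 4, whose derivative
# x^3 has Lipschitz constant L = 3 on the interval [-1, 1] reached from x0 = 1.

def f(x):
    return 0.25 * x**4

def grad(x):
    return x**3

L = 3.0      # Lipschitz constant of f' on [-1, 1]
x = 1.0      # starting point
fstar = 0.0  # optimal value, attained at x = 0

for k in range(1, 100001):
    x -= grad(x) / L  # standard gradient step with step size 1/L
    if k in (10, 100, 1000, 10000, 100000):
        gap = f(x) - fstar
        # k * gap shrinking toward 0 is consistent with an o(1/k) rate;
        # an O(1/k) bound alone would only keep k * gap bounded.
        print(f"k = {k:6d}   f(x_k) - f* = {gap:.3e}   k*(f(x_k) - f*) = {k * gap:.3e}")
```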