  


Beneath the valley of the noncommutative arithmetic-geometric mean inequality: conjectures, case studies, and consequences
Benjamin Recht (brecht@cs.wisc.edu)

Abstract: Randomized algorithms that base iteration-level decisions on samples from some pool are ubiquitous in machine learning and optimization. Examples include stochastic gradient descent and randomized coordinate descent. This paper makes progress toward theoretically evaluating the difference in performance between sampling with and without replacement in such algorithms. Focusing on least-mean-squares optimization, we formulate a noncommutative arithmetic-geometric mean inequality that would prove that the expected convergence rate of without-replacement sampling is faster than that of with-replacement sampling. We demonstrate that this inequality holds for many classes of random matrices and for some pathological examples as well. We provide a deterministic worst-case bound on the discrepancy between the two sampling models, and explore some of the impediments to proving this inequality in full generality. We detail the consequences of this inequality for stochastic gradient descent and the randomized Kaczmarz algorithm for solving linear systems.

Keywords: Matrix inequalities. Randomized algorithms. Random matrices. Stochastic gradient descent. Incremental gradient descent.

Category 1: Nonlinear Optimization

Entry Submitted: 02/19/2012
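The two sampling models compared in the abstract can be illustrated with a minimal sketch (not from the paper; problem dimensions, step size, and epoch count are illustrative assumptions): an epoch of stochastic gradient descent on a least-mean-squares problem either draws row indices i.i.d. (with replacement) or processes a random permutation of the rows (without replacement).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic least-mean-squares problem: minimize (1/n) * sum_i (a_i^T x - b_i)^2,
# with a consistent system so the minimizer x_star is known exactly.
n, d = 50, 10
A = rng.standard_normal((n, d))
x_star = rng.standard_normal(d)
b = A @ x_star


def sgd_error(with_replacement, epochs=30, step=0.01, seed=1):
    """Run SGD from x = 0 and return the final distance to x_star.

    Each epoch touches n row indices, drawn either i.i.d. (with
    replacement) or as a random permutation (without replacement).
    """
    rng_local = np.random.default_rng(seed)
    x = np.zeros(d)
    for _ in range(epochs):
        if with_replacement:
            order = rng_local.integers(0, n, size=n)
        else:
            order = rng_local.permutation(n)
        for i in order:
            residual = A[i] @ x - b[i]
            x -= step * residual * A[i]  # gradient of (a_i^T x - b_i)^2 / 2
    return np.linalg.norm(x - x_star)


err_with = sgd_error(with_replacement=True)
err_without = sgd_error(with_replacement=False)
```

The conjectured inequality concerns the expected convergence rates of these two loops; on a single run, either sampling scheme may happen to finish closer to the solution, so any empirical comparison should average over many seeds.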