Optimization Online


Over-Parameterized Deep Neural Networks Have No Strict Local Minima For Any Continuous Activations

Dawei Li (dawei2***at***illinois.edu)
Tian Ding (dt016***at***ie.cuhk.edu.hk)
Ruoyu Sun (ruoyus***at***illinois.edu)

Abstract: In this paper, we study the loss surface of over-parameterized fully connected deep neural networks. We prove that for any continuous activation function, the loss function has no bad strict local minimum, both in the regular sense and in the sense of sets. This result holds for any convex and differentiable loss function, and the data samples are only required to be distinct in at least one dimension. Furthermore, we show that bad local minima do exist for a class of activation functions, so without further assumptions it is impossible to prove that every local minimum is a global minimum.
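To make the setting concrete: "over-parameterized" here typically means the network is wide enough relative to the number of training samples, and a strict local minimum is a parameter point whose loss is strictly smaller than at every nearby point. The following is a minimal illustrative sketch (not the paper's construction or proof technique) of why over-parameterization matters: a one-hidden-layer network with a continuous activation and hidden width at least the number of distinct samples can typically drive a convex training loss to zero, since the convex output-layer problem then has an interpolating solution.

```python
import numpy as np

# Illustrative sketch only, with assumed sizes n, d, m chosen for the demo:
# with hidden width m >= number of samples n, the hidden feature matrix is
# generically full row rank, so the (convex) output-layer least-squares
# problem attains zero training loss.
rng = np.random.default_rng(0)

n, d, m = 10, 3, 200              # samples, input dim, hidden width (m >= n)
X = rng.standard_normal((n, d))   # inputs, almost surely pairwise distinct
y = rng.standard_normal(n)        # arbitrary labels

W = rng.standard_normal((d, m))   # random first-layer weights, held fixed
H = np.tanh(X @ W)                # hidden features, shape (n, m)

# Solve the convex least-squares problem for the output layer alone.
v, *_ = np.linalg.lstsq(H, y, rcond=None)
loss = 0.5 * np.mean((H @ v - y) ** 2)
print(f"training loss: {loss:.2e}")
```

When H has rank n, the residual is at machine precision; in the under-parameterized regime (m < n) a nonzero residual generally remains, which is why width drives the interpolation phenomenon.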

Keywords: non-convex optimization, landscape, deep neural networks, over-parameterization

Category 1: Nonlinear Optimization


Download: [PDF]

Entry Submitted: 11/22/2018
Entry Accepted: 11/22/2018
Entry Last Modified: 11/23/2018
