Spurious Local Minima Exist for Almost All Over-parameterized Neural Networks

Tian Ding(dt016***at***ie.cuhk.edu.hk)
Dawei Li(dawei2***at***illinois.edu)
Ruoyu Sun(ruoyus***at***illinois.edu)

Abstract: A popular belief for explaining the efficiency of training deep neural networks is that over-parameterized neural networks have a nice landscape. However, it remains unclear whether over-parameterized neural networks contain spurious local minima in general, since all current positive results cannot prove the non-existence of bad local minima, and all current negative results place strong restrictions on the activation functions, data samples, or network architecture. In this paper we answer this question with a surprisingly negative result. In particular, we prove that for almost all deep over-parameterized non-linear neural networks, spurious local minima exist for generic input data samples. Our result helps give a more exact characterization of the landscape of deep neural networks and corrects a misunderstanding that has been widely held for decades.

Keywords: Over-parameterized neural networks; local minima; landscape; deep learning

Category 1: Global Optimization (Theory)

Category 2: Nonlinear Optimization (Other)


Entry Submitted: 10/04/2019
Entry Accepted: 10/04/2019
Entry Last Modified: 10/04/2019

