Complex neural networks have no spurious local minima

1 Jan 2021 · Xingtu Liu

Most non-linear neural networks are known to have spurious local minima (Yun et al., 2019), and training a neural network has been shown to be NP-hard (Blum & Rivest, 1988). A line of work has studied the global optimality of neural networks in various settings, but all networks previously shown to be free of spurious local minima are either linear networks or networks analyzed under unrealistic assumptions. In this work we demonstrate for the first time that a non-linear neural network can have no spurious local minima without any assumptions. Recently, a number of papers have considered complex-valued neural networks (CVNNs) in various settings and suggest that CVNNs perform competitively with, or even better than, real-valued networks. However, there is currently no theoretical analysis of the optimization of complex-valued networks, in part because complex functions usually have a very different optimization landscape. This is the first work towards analysing the optimization landscape of CVNNs. We prove the surprising result that no spurious local minima exist for one-hidden-layer complex-valued neural networks with quadratic activation. Since CVNNs can be trained on real-valued datasets and our result requires no assumptions, it applies to practical networks. Along the way, we develop a novel set of tools and techniques for analyzing the optimization of CVNNs, which may be useful in other contexts. Lastly, we prove that spurious local minima do exist for CVNNs with the non-analytic CReLU activation.
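
The positive result concerns one-hidden-layer CVNNs with the analytic quadratic activation, while the negative result concerns the non-analytic CReLU (ReLU applied separately to the real and imaginary parts). As a concrete illustration of the two architectures, here is a minimal NumPy sketch; the shapes, names, and squared-loss objective are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def quadratic(z):
    # Analytic (holomorphic) quadratic activation: sigma(z) = z^2.
    return z ** 2

def crelu(z):
    # Non-analytic CReLU: ReLU on the real and imaginary parts separately.
    return np.maximum(z.real, 0) + 1j * np.maximum(z.imag, 0)

def forward(X, W, v, activation=quadratic):
    # One-hidden-layer CVNN: X in C^{n x d}, W in C^{d x k}, v in C^k.
    return activation(X @ W) @ v

def squared_loss(X, y, W, v, activation=quadratic):
    # Half the sum of squared moduli of the residuals.
    r = forward(X, W, v, activation) - y
    return 0.5 * np.sum(np.abs(r) ** 2)

# Example with a real-valued dataset (the result covers this case too):
rng = np.random.default_rng(0)
n, d, k = 32, 5, 3                          # hypothetical sizes
X = rng.standard_normal((n, d))             # real inputs, promoted to complex
y = rng.standard_normal(n)
W = rng.standard_normal((d, k)) + 1j * rng.standard_normal((d, k))
v = rng.standard_normal(k) + 1j * rng.standard_normal(k)
print(squared_loss(X, y, W, v))             # quadratic activation
print(squared_loss(X, y, W, v, crelu))      # CReLU activation
```

Note that `quadratic` is a polynomial in z and hence holomorphic, whereas `crelu` depends on both z and its conjugate, which is why the two activations can sit on opposite sides of the paper's dichotomy.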
