no code implementations • 6 Mar 2023 • Steffen Dereich, Sebastian Kassing
We prove the existence of global minima in the loss landscape for the approximation of continuous target functions using shallow feedforward artificial neural networks with ReLU activation.
no code implementations • 28 Feb 2023 • Steffen Dereich, Arnulf Jentzen, Sebastian Kassing
Many mathematical convergence results for gradient descent (GD) based algorithms rely on the assumption that the GD process is (almost surely) bounded; moreover, in concrete numerical simulations, divergence of the GD process may slow down, or even entirely prevent, convergence of the error function.
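The role of boundedness can be illustrated on the simplest possible example. The following sketch (ours, not from the paper) runs plain GD on the quadratic f(x) = x², where the iteration is exactly x_{k+1} = (1 - 2·eta)·x_k, so the iterates stay bounded if and only if |1 - 2·eta| ≤ 1:

```python
# Illustrative sketch (not from the paper): gradient descent on
# f(x) = x^2 with gradient f'(x) = 2x.  The exact iteration is
#   x_{k+1} = x_k - eta * 2 * x_k = (1 - 2*eta) * x_k,
# so the iterates converge for 0 < eta < 1 and diverge for eta > 1.
def gd(x0, eta, steps):
    x = x0
    for _ in range(steps):
        x = x - eta * 2 * x  # one GD step on f(x) = x^2
    return x

small_step = gd(1.0, 0.1, 50)  # contraction factor |1 - 0.2| = 0.8: bounded, converges
large_step = gd(1.0, 1.1, 50)  # factor |1 - 2.2| = 1.2: iterates blow up
```

With the divergent step size, not only does the error fail to decrease, the iterates themselves leave every bounded set, which is precisely the failure mode the boundedness assumption excludes.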
no code implementations • 16 Feb 2021 • Steffen Dereich, Sebastian Kassing
In this article, we study the convergence of stochastic gradient descent (SGD) schemes, including momentum stochastic gradient descent (MSGD), under weak assumptions on the underlying loss landscape.
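For concreteness, a minimal sketch of the MSGD iteration on a noisy quadratic objective (the objective, step size, and momentum parameter below are our illustrative choices, not the paper's):

```python
import random

# Hypothetical sketch of momentum SGD (MSGD) minimizing f(x) = (x - 3)^2
# from noisy gradient estimates g_k = 2*(x_k - 3) + noise.  Updates:
#     m_{k+1} = beta * m_k + g_k        (momentum accumulation)
#     x_{k+1} = x_k - eta * m_{k+1}     (parameter step)
def msgd(x0, eta=0.05, beta=0.9, steps=2000, seed=0):
    rng = random.Random(seed)
    x, m = x0, 0.0
    for _ in range(steps):
        g = 2.0 * (x - 3.0) + rng.gauss(0.0, 0.1)  # noisy gradient
        m = beta * m + g
        x = x - eta * m
    return x

x_final = msgd(10.0)  # ends up near the minimizer x* = 3
```

Setting beta = 0 recovers plain SGD, so the same loop covers both schemes considered above.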