Search Results for author: Adrian Riekert

Found 12 papers, 0 papers with code

Non-convergence to global minimizers for Adam and stochastic gradient descent optimization and constructions of local minimizers in the training of artificial neural networks

no code implementations • 7 Feb 2024 • Arnulf Jentzen, Adrian Riekert

In this work we solve this research problem in the situation of shallow ANNs with the rectified linear unit (ReLU) and related activations and the standard mean square error loss by disproving that, in the training of such ANNs, SGD methods (such as the plain vanilla SGD, the momentum SGD, the AdaGrad, the RMSprop, and the Adam optimizers) can find a global minimizer with high probability.
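
The training setup this result concerns (a shallow ReLU ANN, the mean square error risk, and plain vanilla SGD) can be spelled out in a few lines. The following NumPy sketch uses assumed choices, namely the target function |x|, a uniform input distribution on [-1, 1], width 16, and step size 1e-2; it only illustrates the training dynamics and does not reproduce the paper's non-convergence construction. The plain SGD update at the end could be replaced by a momentum, AdaGrad, RMSprop, or Adam update.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shallow ReLU ANN: f(x) = sum_j v[j] * relu(w[j] * x + b[j]) + c
width = 16
w = rng.normal(size=width); b = rng.normal(size=width)
v = rng.normal(size=width); c = 0.0

target = lambda x: np.abs(x)              # hypothetical target function
lr, batch = 1e-2, 32                      # step size and minibatch size (assumed)

for step in range(20_000):                # plain vanilla SGD on the MSE risk
    x = rng.uniform(-1.0, 1.0, size=batch)
    y = target(x)
    pre = np.outer(x, w) + b              # (batch, width) pre-activations
    h = np.maximum(pre, 0.0)              # ReLU
    err = h @ v + c - y                   # residuals f(x) - y
    mask = (pre > 0).astype(float)
    # gradients of the empirical mean square error
    grad_v = 2 * h.T @ err / batch
    grad_c = 2 * err.mean()
    grad_w = 2 * (mask * np.outer(err, v)).T @ x / batch
    grad_b = 2 * (mask * np.outer(err, v)).sum(axis=0) / batch
    w -= lr * grad_w; b -= lr * grad_b
    v -= lr * grad_v; c -= lr * grad_c

print(np.mean(err ** 2))                  # final minibatch risk estimate
```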

Deep neural network approximation of composite functions without the curse of dimensionality

no code implementations • 12 Apr 2023 • Adrian Riekert

In this article we identify a general class of high-dimensional continuous functions that can be approximated by deep neural networks (DNNs) with the rectified linear unit (ReLU) activation without the curse of dimensionality.
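
One structural reason composite functions are benign is that realizations of ReLU DNNs compose: chaining a network for an inner map h with a network for an outer map g yields a deeper ReLU DNN for g ∘ h whose parameter count grows additively in the parts. The sketch below uses hypothetical component dimensions and random, untrained weights and illustrates only this composition step, not the article's quantitative approximation rates.

```python
import numpy as np

rng = np.random.default_rng(1)
relu = lambda z: np.maximum(z, 0.0)

def mlp(sizes):
    """Random ReLU MLP with the given layer widths, as a list of (W, b)."""
    return [(rng.normal(size=(m, n)) / np.sqrt(n), np.zeros(m))
            for n, m in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    for i, (W, b) in enumerate(params):
        x = W @ x + b
        if i < len(params) - 1:           # ReLU on hidden layers only
            x = relu(x)
    return x

# Hypothetical subnetworks: inner h: R^100 -> R^5, outer g: R^5 -> R.
inner = mlp([100, 64, 5])
outer = mlp([5, 32, 1])

# Chaining the realizations gives a deeper DNN representing g(h(x)); its
# parameter count is the sum of the parts rather than something that
# blows up with the input dimension 100.
composite = lambda x: forward(outer, forward(inner, x))
n_params = sum(W.size + b.size for W, b in inner + outer)
print(composite(rng.normal(size=100)), n_params)
```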

Algorithmically Designed Artificial Neural Networks (ADANNs): Higher order deep operator learning for parametric partial differential equations

no code implementations • 7 Feb 2023 • Arnulf Jentzen, Adrian Riekert, Philippe von Wurstemberger

The obtained ANN architectures and their initialization schemes are thus strongly inspired by numerical algorithms as well as by popular deep learning methodologies from the literature, and in that sense we refer to the introduced ANNs in conjunction with their tailor-made initialization schemes as Algorithmically Designed Artificial Neural Networks (ADANNs).
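
As a rough, hypothetical illustration of the algorithmically designed idea (not one of the architectures from the paper), one can wrap a classical numerical method inside an ANN whose trainable correction has a zero-initialized output layer, so that at initialization the model coincides with the classical method. The toy parametric problem, the explicit Euler base scheme, and all sizes below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
relu = lambda z: np.maximum(z, 0.0)

# Toy parametric problem (assumed for illustration): map the parameter a
# to the solution y(1) of y' = -a*y, y(0) = 1, i.e. to exp(-a).
def base_scheme(a, n_steps=8):
    """Classical explicit Euler approximation of exp(-a)."""
    y, h = 1.0, 1.0 / n_steps
    for _ in range(n_steps):
        y = y - h * a * y
    return y

# ADANN-flavoured model: base numerical scheme plus a small ReLU correction
# whose output layer is initialized at zero, so the model equals the
# classical algorithm at initialization and training only has to learn the
# (small, smooth) defect of that scheme.
width = 32
W1 = rng.normal(size=(width, 1)); b1 = rng.normal(size=width)
W2 = np.zeros((1, width))                     # zero-initialized output layer

def model(a):
    correction = W2 @ relu(W1 @ np.array([a]) + b1)
    return base_scheme(a) + correction[0]

a = 0.7
print(model(a), base_scheme(a), np.exp(-a))   # model == base scheme at init
```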

Operator learning

Normalized gradient flow optimization in the training of ReLU artificial neural networks

no code implementations • 13 Jul 2022 • Simon Eberle, Arnulf Jentzen, Adrian Riekert, Georg Weiss

The training of artificial neural networks (ANNs) is nowadays a highly relevant algorithmic procedure with many applications in science and industry.

On the existence of global minima and convergence analyses for gradient descent methods in the training of deep neural networks

no code implementations • 17 Dec 2021 • Arnulf Jentzen, Adrian Riekert

In this article we study fully-connected feedforward deep ReLU ANNs with an arbitrarily large number of hidden layers and we prove convergence of the risk of the GD optimization method with random initializations in the training of such ANNs under three assumptions: that the unnormalized probability density function of the probability distribution of the input data of the considered supervised learning problem is piecewise polynomial, that the target function (describing the relationship between the input data and the output data) is piecewise polynomial, and that the risk function of the considered supervised learning problem admits at least one regular global minimum.
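
A minimal sketch of the training procedure that this result analyzes: full-batch, deterministic GD from a random initialization on a deep fully-connected feedforward ReLU network, with a piecewise polynomial target function and a piecewise polynomial (here uniform) input density. Depth, width, sample size, step size, and step count below are assumptions; the snippet illustrates the setting, not the convergence proof.

```python
import torch

torch.manual_seed(0)

# Deep fully-connected feedforward ReLU network (depth and width assumed).
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.ReLU(),
    torch.nn.Linear(32, 32), torch.nn.ReLU(),
    torch.nn.Linear(32, 32), torch.nn.ReLU(),
    torch.nn.Linear(32, 1),
)

# Piecewise polynomial target and a piecewise polynomial (uniform) input
# density, matching the flavour of the abstract's assumptions.
def target(x):
    return torch.where(x < 0, x ** 2, 1.0 - x)

x = torch.rand(2048, 1) * 2 - 1               # samples from Uniform(-1, 1)
y = target(x)

opt = torch.optim.SGD(net.parameters(), lr=1e-2)
for step in range(2000):                      # full-batch (deterministic) GD
    opt.zero_grad()
    risk = ((net(x) - y) ** 2).mean()         # empirical mean square risk
    risk.backward()
    opt.step()
print(float(risk))                            # risk after training
```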

Convergence proof for stochastic gradient descent in the training of deep neural networks with ReLU activation for constant target functions

no code implementations • 13 Dec 2021 • Martin Hutzenthaler, Arnulf Jentzen, Katharina Pohl, Adrian Riekert, Luca Scarpa

In many numerical simulations stochastic gradient descent (SGD) type optimization methods perform very effectively in the training of deep neural networks (DNNs), but to this day it remains an open research problem to provide a mathematical convergence analysis which rigorously explains the success of SGD type optimization methods in the training of DNNs.

Existence, uniqueness, and convergence rates for gradient flows in the training of artificial neural networks with ReLU activation

no code implementations • 18 Aug 2021 • Simon Eberle, Arnulf Jentzen, Adrian Riekert, Georg S. Weiss

In the second main result of this article we prove that, in the training of such ANNs, under the assumption that the target function and the density function of the probability distribution of the input data are piecewise polynomial, every non-divergent GF trajectory converges with an appropriate rate of convergence to a critical point and the risk of the non-divergent GF trajectory converges with rate 1 to the risk of the critical point.

A proof of convergence for the gradient descent optimization method with random initializations in the training of neural networks with ReLU activation for piecewise linear target functions

no code implementations • 10 Aug 2021 • Arnulf Jentzen, Adrian Riekert

Despite the great success of GD type optimization methods in numerical simulations for the training of ANNs with ReLU activation, it remains - even in the simplest situation of the plain vanilla GD optimization method with random initializations and ANNs with one hidden layer - an open problem to prove (or disprove) the conjecture that the risk of the GD optimization method converges in the training of such ANNs to zero as the width of the ANNs, the number of independent random initializations, and the number of GD steps increase to infinity.
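
The conjecture concerns a joint limit in which the width of the ANN, the number of independent random initializations, and the number of GD steps all tend to infinity. A small experiment in that spirit is sketched below; the piecewise linear target |x|, the widths, the number of initializations, and all other numbers are assumptions, and the snippet only illustrates the empirical phenomenon behind the conjecture.

```python
import torch

def train_one(width, seed, steps=3000, lr=5e-2):
    """Plain vanilla full-batch GD on a one-hidden-layer ReLU ANN."""
    torch.manual_seed(seed)                   # random initialization
    net = torch.nn.Sequential(
        torch.nn.Linear(1, width), torch.nn.ReLU(), torch.nn.Linear(width, 1))
    x = torch.rand(1024, 1) * 2 - 1
    y = torch.abs(x)                          # piecewise linear target (assumed)
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        risk = ((net(x) - y) ** 2).mean()
        risk.backward()
        opt.step()
    return float(risk)

# Best risk over several independent random initializations, for growing width.
for width in (4, 16, 64):
    best = min(train_one(width, seed) for seed in range(5))
    print(width, best)
```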

Convergence analysis for gradient flows in the training of artificial neural networks with ReLU activation

no code implementations • 9 Jul 2021 • Arnulf Jentzen, Adrian Riekert

Finally, in the special situation where there is only one neuron on the hidden layer (1-dimensional hidden layer) we strengthen the above-named result for affine linear target functions by proving that the risk of every (not necessarily bounded) GF trajectory converges to zero if the initial risk is sufficiently small.
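
The one-neuron case can be simulated directly by discretizing the gradient flow ODE with a small explicit Euler step. In the sketch below the affine linear target 2x + 1, the input distribution on [0, 1], the initial parameters (chosen so that the initial risk is small), and the step size are assumptions; the risk along the discretized trajectory is then observed to decay towards zero, in line with the stated result.

```python
import numpy as np

rng = np.random.default_rng(3)

# One hidden ReLU neuron: f(x) = v * relu(w*x + b) + c.
target = lambda x: 2.0 * x + 1.0              # affine linear target (assumed)
x = rng.uniform(0.0, 1.0, size=4096)          # assumed input distribution
y = target(x)

w, b, v, c = 1.8, 0.2, 1.0, 0.7               # initialization with small risk (assumed)
dt = 1e-2                                     # explicit Euler step for the GF ODE

for _ in range(20_000):
    pre = w * x + b
    h = np.maximum(pre, 0.0)                  # ReLU
    err = v * h + c - y
    mask = (pre > 0).astype(float)
    gw = 2 * np.mean(err * v * mask * x)      # gradient of the risk w.r.t. w
    gb = 2 * np.mean(err * v * mask)
    gv = 2 * np.mean(err * h)
    gc = 2 * np.mean(err)
    w, b, v, c = w - dt * gw, b - dt * gb, v - dt * gv, c - dt * gc

print(np.mean(err ** 2))                      # risk along the discretized GF trajectory
```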

A proof of convergence for stochastic gradient descent in the training of artificial neural networks with ReLU activation for constant target functions

no code implementations • 1 Apr 2021 • Arnulf Jentzen, Adrian Riekert

In this article we study the stochastic gradient descent (SGD) optimization method in the training of fully-connected feedforward artificial neural networks with ReLU activation.
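
For a constant target function the procedure is particularly simple to write down. The PyTorch sketch below (constant value 0.5, input dimension 3, width 64, step size 1e-2, all assumptions) runs SGD with fresh minibatches and empirically shows the risk decaying towards zero, the behaviour that the convergence result addresses.

```python
import torch

torch.manual_seed(1)
net = torch.nn.Sequential(                    # fully-connected ReLU ANN
    torch.nn.Linear(3, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1))
opt = torch.optim.SGD(net.parameters(), lr=1e-2)

for step in range(20_000):                    # SGD with fresh minibatches
    x = torch.rand(64, 3)                     # assumed input distribution
    y = torch.full((64, 1), 0.5)              # constant target function
    opt.zero_grad()
    risk = ((net(x) - y) ** 2).mean()
    risk.backward()
    opt.step()
print(float(risk))                            # empirically tends towards zero
```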

Strong overall error analysis for the training of artificial neural networks via random initializations

no code implementations • 15 Dec 2020 • Arnulf Jentzen, Adrian Riekert

Although deep learning based approximation algorithms have been applied very successfully to numerous problems, at the moment the reasons for their performance are not entirely understood from a mathematical point of view.

Stochastic Optimization
