1 code implementation • 12 Apr 2024 • Matteo Tucat, Anirbit Mukherjee
In this work, we instantiate a regularized form of the gradient clipping algorithm and prove that it can converge to the global minima of deep neural network loss functions provided that the net is of sufficient width.