no code implementations • 23 Aug 2023 • Luca Herranz-Celotti, Jean Rouat
However, analysing deep recurrent networks, we identify a new additive source of exponential explosion that emerges from counting gradient paths in a rectangular grid in depth and time.
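As a rough illustration of the path-counting idea (a sketch under stated assumptions, not the authors' code): suppose a gradient moving backward through the grid can take either one step in time or one step in depth at each move. The number of distinct paths across `depth` layers and `steps` time steps is then the binomial coefficient C(depth + steps, depth), which grows exponentially when both dimensions grow together. The function name `count_gradient_paths` and this simplified step model are hypothetical.

```python
from math import comb


def count_gradient_paths(depth: int, steps: int) -> int:
    """Count monotone lattice paths across a (depth x steps) grid.

    Assumption: each move goes back exactly one layer or one time step.
    """
    # paths[l][t] = number of ways to reach cell (l, t) from (0, 0).
    # Cells on either border can be reached in exactly one way.
    paths = [[1] * (steps + 1) for _ in range(depth + 1)]
    for l in range(1, depth + 1):
        for t in range(1, steps + 1):
            # Arrive either from the previous layer or the previous time step.
            paths[l][t] = paths[l - 1][t] + paths[l][t - 1]
    return paths[depth][steps]


if __name__ == "__main__":
    for size in (2, 5, 10, 20):
        n = count_gradient_paths(size, size)
        # The dynamic-programming count matches the closed-form binomial.
        assert n == comb(2 * size, size)
        print(f"depth = steps = {size}: {n} paths")
```

Under this model, even a modest 10 x 10 grid already has 184756 paths, so any per-path contribution that does not shrink is summed a very large number of times.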
no code implementations • 18 May 2023 • Luca Herranz-Celotti, Ermal Rrapaj
Our method outperforms existing techniques in terms of test loss while simultaneously halving the number of parameters.
no code implementations • 1 Feb 2022 • Luca Herranz-Celotti, Jean Rouat
We show how it can be used to reduce the need for an extensive grid search over the dampening, sharpness and tail-fatness of the surrogate gradient (SG).