no code implementations • 17 Sep 2023 • Pulkit Gopalani, Samyak Jha, Anirbit Mukherjee
In this note, we demonstrate a first-of-its-kind provable convergence of SGD to the global minima of the appropriately regularized logistic empirical risk of depth-$2$ nets. The result holds for arbitrary data and for any number of gates, provided the activations are adequately smooth and bounded, like sigmoid and tanh.
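The setting described above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a net of the form f(x) = sum_j a_j tanh(<w_j, x>) with fixed outer weights a_j, a squared Frobenius-norm regularizer on the inner weights W, and labels in {-1, +1}. The function names, the regularization strength `lam`, and the step size `lr` are illustrative choices, not values or conditions taken from the note.

```python
import math
import random


def net(W, a, x):
    """Depth-2 net: f(x) = sum_j a[j] * tanh(<W[j], x>)."""
    return sum(a_j * math.tanh(sum(w * xi for w, xi in zip(row, x)))
               for a_j, row in zip(a, W))


def regularized_risk(W, a, data, lam):
    """Mean logistic loss over the data plus (lam / 2) * ||W||_F^2."""
    loss = sum(math.log1p(math.exp(-y * net(W, a, x))) for x, y in data) / len(data)
    frob = sum(w * w for row in W for w in row)
    return loss + 0.5 * lam * frob


def sgd_step(W, a, x, y, lam, lr):
    """One SGD step on a single sample (x, y); updates W in place."""
    margin = y * net(W, a, x)
    # d/df log(1 + exp(-y f)) = -y / (1 + exp(y f))
    dloss_df = -y / (1.0 + math.exp(margin))
    for j, row in enumerate(W):
        pre = sum(w * xi for w, xi in zip(row, x))
        dact = 1.0 - math.tanh(pre) ** 2  # derivative of tanh
        for k in range(len(row)):
            grad = dloss_df * a[j] * dact * x[k] + lam * row[k]
            row[k] -= lr * grad


def train(data, width=4, lam=0.1, lr=0.05, steps=2000, seed=0):
    """Run constant-step SGD from a Gaussian initialization (illustrative)."""
    rng = random.Random(seed)
    dim = len(data[0][0])
    W = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(width)]
    # Assumption: outer weights are fixed at +/-1 and only W is trained.
    a = [1.0 if j % 2 == 0 else -1.0 for j in range(width)]
    for _ in range(steps):
        x, y = rng.choice(data)
        sgd_step(W, a, x, y, lam, lr)
    return W, a
```

A call such as `train([([1.0, 0.0], 1), ([0.0, 1.0], -1)])` returns the trained inner weights and the fixed outer weights, and `regularized_risk` can then be evaluated on them. The sketch only shows the objective and the update rule; it does not check the smoothness, boundedness, or regularization conditions under which the note proves convergence.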