no code implementations • 5 Mar 2018 • Mor Shpigel Nacson, Jason D. Lee, Suriya Gunasekar, Pedro H. P. Savarese, Nathan Srebro, Daniel Soudry
We show that for a large family of losses with super-polynomially decaying tails, gradient descent iterates on linear networks of any depth converge in the direction of the $L_2$ maximum-margin solution, while this does not hold for losses with heavier tails.
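For context, here is a minimal formal sketch of the claimed convergence, assuming the standard linearly separable setup; the notation ($w(t)$ for the iterates, $(x_n, y_n)$ for the data, $\hat{w}$ for the max-margin separator) is ours, not from the listing:

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Sketch (assumed setup): linearly separable data (x_n, y_n) and a loss whose
% tail decays super-polynomially, e.g. the exponential loss exp(-u).
% The claim is that the gradient descent iterates w(t) converge in direction
% to the L2 maximum-margin separator:
\[
  \lim_{t \to \infty} \frac{w(t)}{\lVert w(t) \rVert_2}
  = \frac{\hat{w}}{\lVert \hat{w} \rVert_2},
  \qquad
  \hat{w} = \operatorname*{arg\,min}_{w} \lVert w \rVert_2
  \quad \text{s.t.} \quad y_n \, w^{\top} x_n \ge 1 \;\; \forall n.
\]
\end{document}
```

For losses with heavier (e.g. polynomially decaying) tails, the abstract states that this directional convergence fails, which is the contrast the paper draws.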
1 code implementation • 22 Nov 2017 • Pedro H. P. Savarese, Mayank Kakodkar, Bruno Ribeiro
We propose a Las Vegas transformation of Markov Chain Monte Carlo (MCMC) estimators of Restricted Boltzmann Machines (RBMs).
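The listing marks one code implementation but reproduces none of it here. As background for what is being transformed, below is a minimal NumPy sketch of a plain block-Gibbs MCMC estimator for a binary RBM's negative-phase statistics; all names, shapes, and the mean-field readout are our assumptions, and the paper's Las Vegas transformation itself is not reproduced.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v, W, b, c, rng):
    """One block-Gibbs step in a binary RBM: sample hidden units given
    visibles, then visibles given hiddens."""
    h = (rng.random(W.shape[1]) < sigmoid(v @ W + c)).astype(float)
    v_new = (rng.random(W.shape[0]) < sigmoid(W @ h + b)).astype(float)
    return v_new

def mcmc_estimate(v0, W, b, c, n_steps, rng):
    """Crude Monte Carlo estimate of the model expectation E[v h^T],
    the negative-phase term of the RBM log-likelihood gradient."""
    v = v0
    for _ in range(n_steps):
        v = gibbs_step(v, W, b, c, rng)
    # Mean-field hidden probabilities for the final readout (a common choice).
    return np.outer(v, sigmoid(v @ W + c))

# Usage sketch: a small random RBM (sizes are arbitrary).
rng = np.random.default_rng(0)
nv, nh = 6, 4
W = 0.01 * rng.standard_normal((nv, nh))
b, c = np.zeros(nv), np.zeros(nh)
v0 = (rng.random(nv) < 0.5).astype(float)
neg_phase = mcmc_estimate(v0, W, b, c, n_steps=100, rng=rng)
```

A Las Vegas scheme, in the standard sense of the term, runs for a random amount of time but returns an exact answer; per the abstract, the paper's transformation turns estimators like the one above into that form.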
no code implementations • 4 Nov 2016 • Pedro H. P. Savarese, Leonardo O. Mazza, Daniel R. Figueiredo
We evaluate our method on MNIST using fully-connected networks, showing empirical indications that our augmentation facilitates the optimization of deep models, and that it provides high tolerance to full layer removal: the model retains over 90% of its performance even after half of its layers have been randomly removed.
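The abstract does not spell out the augmentation itself, so the sketch below uses an assumed gated residual formulation of a fully-connected block (`GatedResidualBlock` and `remove_random_layers` are hypothetical names) purely to make the layer-removal experiment concrete: driving a block's gate to zero turns it into the identity, mimicking the random removal of half the layers.

```python
import torch
import torch.nn as nn

class GatedResidualBlock(nn.Module):
    """Fully-connected block with a learned scalar gate (assumed form).
    With g = relu(k): output = g * f(x) + (1 - g) * x, so g = 0
    makes the block an exact identity map."""
    def __init__(self, dim):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
        self.k = nn.Parameter(torch.ones(1))  # init at 1 is arbitrary here

    def forward(self, x):
        g = torch.relu(self.k)
        return g * self.f(x) + (1.0 - g) * x

def remove_random_layers(model, fraction):
    """Simulate the robustness test from the abstract: replace a random
    `fraction` of gated blocks with the identity, leave the rest intact."""
    blocks = [m for m in model.modules() if isinstance(m, GatedResidualBlock)]
    drop = torch.randperm(len(blocks))[: int(len(blocks) * fraction)]
    with torch.no_grad():
        for i in drop.tolist():
            blocks[i].k.fill_(0.0)  # gate -> 0: block becomes the identity

# Usage sketch: a 10-block MLP for 784-dim inputs (e.g. flattened MNIST);
# after dropping half the blocks, re-evaluate and compare accuracy.
model = nn.Sequential(nn.Linear(784, 256),
                      *[GatedResidualBlock(256) for _ in range(10)],
                      nn.Linear(256, 10))
remove_random_layers(model, fraction=0.5)
```

Comparing accuracy before and after such a call is the shape of the test behind the reported over-90% retention; the specific gating used in the paper may differ.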