no code implementations • 24 Dec 2020 • Mahesh Chandra Mukkamala, Jalal Fadili, Peter Ochs
We fix this issue by proposing the MAP property, which generalizes the $L$-smad property and is also valid for a large class of nonconvex nonsmooth composite problems.
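For context, the $L$-smad special case that MAP generalizes is a descent inequality of the form $f(y) \le f(x) + \langle \nabla f(x), y - x\rangle + L\, D_h(y, x)$, where $D_h$ is the Bregman distance generated by a reference function $h$. Below is a minimal numerical check of this inequality, using the textbook pairing $f(x) = x^4$ with $h(x) = x^4/4 + x^2/2$; the example is my choice, not taken from the paper.

```python
import numpy as np

# f(x) = x^4 has no globally Lipschitz gradient, so the classical descent
# lemma fails; but L*h - f = 2x^4 + 6x^2 is convex for L = 12, which is
# exactly the L-smad condition relative to h.
f      = lambda x: x**4
grad_f = lambda x: 4.0 * x**3
h      = lambda x: 0.25 * x**4 + 0.5 * x**2
grad_h = lambda x: x**3 + x

def bregman_dist(u, x):
    """Bregman distance D_h(u, x) = h(u) - h(x) - grad_h(x) * (u - x)."""
    return h(u) - h(x) - grad_h(x) * (u - x)

L = 12.0
rng = np.random.default_rng(0)
for _ in range(10_000):
    x, y = rng.uniform(-3.0, 3.0, size=2)
    model_error = f(y) - (f(x) + grad_f(x) * (y - x))
    assert model_error <= L * bregman_dist(y, x) + 1e-9
```

The MAP property goes further by allowing general model functions in place of the linearization, which is what makes it applicable to nonsmooth composite problems; the sketch covers only the smooth L-smad case.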
no code implementations • 8 Oct 2019 • Mahesh Chandra Mukkamala, Felix Westerkamp, Emanuel Laude, Daniel Cremers, Peter Ochs
This initiated the development of the Bregman proximal gradient (BPG) algorithm and its inertial (momentum-based) variant CoCaIn BPG; both, however, rely on problem-dependent Bregman distances.
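To make the role of the Bregman distance concrete, here is a minimal sketch of a single BPG step for a smooth objective on the positive orthant, using the Shannon entropy kernel $h(x) = \sum_i x_i \log x_i$ as one example of a problem-dependent choice. The setup and names are mine; this is plain BPG, not the inertial CoCaIn variant.

```python
import numpy as np

def bpg_step(x, grad_fx, step):
    """One Bregman proximal gradient step with h(x) = sum(x * log(x)).
    Solving grad_h(x_next) = grad_h(x) - step * grad_fx, where
    grad_h(x) = log(x) + 1, yields a multiplicative update."""
    return x * np.exp(-step * grad_fx)

# Toy usage: minimize f(x) = 0.5 * ||x - a||^2 over the positive orthant.
a = np.array([0.5, 1.0, 1.5, 2.0, 2.5])
x = np.ones_like(a)
for _ in range(300):
    x = bpg_step(x, x - a, step=0.5)
print(np.max(np.abs(x - a)))  # essentially zero
```

With the Euclidean kernel $h(x) = \|x\|^2/2$ the same update reduces to ordinary gradient descent, which is why the choice of $h$ is where the problem-dependence enters.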
2 code implementations • NeurIPS 2019 • Mahesh Chandra Mukkamala, Peter Ochs
Matrix Factorization is a popular non-convex optimization problem, for which alternating minimization schemes are most commonly used.
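As a point of reference, the classical alternating scheme solves a linear least-squares problem in one factor while the other is held fixed. A minimal sketch of this generic alternating least-squares baseline, not the Bregman scheme proposed in the paper:

```python
import numpy as np

def als(A, rank, iters=200, reg=1e-8):
    """Alternating least squares for min ||A - U @ V.T||_F^2.
    Each half-step is an exact least-squares solve; `reg` is a tiny
    ridge term that keeps the normal equations well conditioned."""
    rng = np.random.default_rng(0)
    U = rng.normal(size=(A.shape[0], rank))
    V = rng.normal(size=(A.shape[1], rank))
    I = reg * np.eye(rank)
    for _ in range(iters):
        U = np.linalg.solve(V.T @ V + I, V.T @ A.T).T  # update U, V fixed
        V = np.linalg.solve(U.T @ U + I, U.T @ A).T    # update V, U fixed
    return U, V

# Toy usage on an exactly rank-4 matrix: the residual drops to near zero.
rng = np.random.default_rng(1)
A = rng.normal(size=(30, 4)) @ rng.normal(size=(4, 20))
U, V = als(A, rank=4)
print(np.linalg.norm(A - U @ V.T) / np.linalg.norm(A))
```

Each subproblem here is convex and solvable in closed form, even though the joint problem in (U, V) is non-convex.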
2 code implementations • 6 Apr 2019 • Mahesh Chandra Mukkamala, Peter Ochs, Thomas Pock, Shoham Sabach
Backtracking line search is an old yet powerful strategy for finding better step sizes for proximal gradient algorithms.
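A minimal sketch of backtracking inside a proximal gradient step, here for an $\ell_1$-regularized smooth objective with the standard quadratic upper-bound test; this is an illustrative textbook version, and the parameter names are mine.

```python
import numpy as np

def soft_threshold(v, tau):
    """Prox of tau * ||.||_1, an illustrative choice of nonsmooth term."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def prox_grad_backtracking(f, grad_f, x, lam, t0=1.0, beta=0.5, max_tries=50):
    """One proximal gradient step on f(x) + lam * ||x||_1: shrink the trial
    step t until f satisfies its quadratic upper bound at the candidate."""
    t, fx, g = t0, f(x), grad_f(x)
    for _ in range(max_tries):
        x_new = soft_threshold(x - t * g, t * lam)
        d = x_new - x
        if f(x_new) <= fx + g @ d + (0.5 / t) * (d @ d):
            break
        t *= beta  # upper bound violated: shrink the step and retry
    return x_new, t

# Toy usage: a small Lasso-type problem.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
f = lambda x: 0.5 * x @ A @ x - b @ x
grad_f = lambda x: A @ x - b
x = np.zeros(2)
for _ in range(100):
    x, _ = prox_grad_backtracking(f, grad_f, x, lam=0.1)
```

The test is guaranteed to pass once $t \le 1/L$ for a locally Lipschitz gradient, so the inner loop terminates after finitely many halvings.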
no code implementations • ICLR 2019 • Quynh Nguyen, Mahesh Chandra Mukkamala, Matthias Hein
We identify a class of over-parameterized deep neural networks with standard activation functions and cross-entropy loss which provably have no bad local valley, in the sense that from any point in parameter space there exists a continuous path on which the cross-entropy loss is non-increasing and gets arbitrarily close to zero.
no code implementations • ICML 2018 • Quynh Nguyen, Mahesh Chandra Mukkamala, Matthias Hein
The recent literature has emphasized the important role of depth in deep learning.
no code implementations • ICML 2017 • Mahesh Chandra Mukkamala, Matthias Hein
Adaptive gradient methods have recently become very popular, in particular because they have proven useful for training deep neural networks.
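As one concrete example of this family, the textbook Adagrad update scales each coordinate by its accumulated squared gradients; this is the generic rule, not the specific variants analyzed in the paper.

```python
import numpy as np

def adagrad_step(x, grad, accum, lr=1.0, eps=1e-8):
    """Textbook Adagrad: per-coordinate step sizes shrink as squared
    gradients accumulate, so frequently-updated coordinates slow down."""
    accum = accum + grad**2
    x = x - lr * grad / (np.sqrt(accum) + eps)
    return x, accum

# Toy usage: minimize 0.5 * ||x - a||^2.
a = np.array([1.0, -2.0, 3.0])
x = np.zeros_like(a)
accum = np.zeros_like(a)
for _ in range(500):
    x, accum = adagrad_step(x, x - a, accum)
print(np.round(x, 4))  # converges to a
```

The per-coordinate scaling is what distinguishes adaptive methods from plain gradient descent with a single global step size.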