no code implementations • 23 May 2023 • Achraf Bahamou, Donald Goldfarb
We propose a new per-layer adaptive step-size procedure for stochastic first-order optimization methods used to minimize empirical loss functions in deep learning, eliminating the need for the user to tune the learning rate (LR).
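As a rough illustration of the layer-wise idea (not the authors' actual procedure, which is derived in the paper), the sketch below computes a step size for one layer from a local quadratic model along its gradient; the Hessian-vector-product callable hess_vec and the Cauchy-style formula are assumptions made purely for illustration.

    import numpy as np

    def layer_step_size(grad, hess_vec, eps=1e-12):
        """Per-layer step from a local quadratic model: eta = g^T g / g^T H g.
        grad: flattened gradient of one layer; hess_vec: callable returning H @ v."""
        g_dot_g = float(grad @ grad)
        g_dot_Hg = float(grad @ hess_vec(grad))
        return g_dot_g / max(g_dot_Hg, eps)   # no hand-tuned base learning rate

    # Each layer l then takes its own step: w_l -= layer_step_size(g_l, hvp_l) * g_l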
no code implementations • 8 Feb 2022 • Achraf Bahamou, Donald Goldfarb, Yi Ren
Specifically, our method uses a block-diagonal approximation to the empirical Fisher matrix: for each layer in the DNN, whether convolutional or feed-forward and fully connected, the associated diagonal block is itself block-diagonal and is composed of a large number of mini-blocks of modest size.
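A minimal sketch of mini-block preconditioning is given below, assuming per-sample layer gradients stacked in a matrix G and contiguous mini-blocks of size b; this grouping is a simplification of the paper's layer-dependent block structure.

    import numpy as np

    def miniblock_fisher_direction(G, b, damping=1e-3):
        """G: (n_samples, d) per-sample gradients for one layer; b: mini-block size.
        Returns the damped mini-block-Fisher-preconditioned update direction."""
        n, d = G.shape
        g_mean = G.mean(axis=0)
        direction = np.empty_like(g_mean)
        for start in range(0, d, b):
            sl = slice(start, min(start + b, d))
            Fb = G[:, sl].T @ G[:, sl] / n          # mini-block of the empirical Fisher
            Fb += damping * np.eye(Fb.shape[0])     # damping keeps the block invertible
            direction[sl] = np.linalg.solve(Fb, g_mean[sl])
        return direction                            # parameter update: w -= lr * direction

Each mini-block solve stays cheap because the blocks are of modest size, which is the point of the structure described above.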
no code implementations • 9 Mar 2021 • Amine Allouah, Achraf Bahamou, Omar Besbes
For settings where the seller knows either the exact probability of sale associated with one historical price or only a confidence interval for it, we fully characterize the optimal achievable performance and develop near-optimal pricing algorithms that adjust to the information at hand.
Computer Science and Game Theory • Information Theory
no code implementations • 12 Feb 2021 • Yi Ren, Achraf Bahamou, Donald Goldfarb
We also propose several improvements to the methods in Goldfarb et al. (2020) that can be applied to both MLPs and CNNs.
1 code implementation • NeurIPS 2020 • Donald Goldfarb, Yi Ren, Achraf Bahamou
We consider the development of practical stochastic quasi-Newton, and in particular Kronecker-factored block-diagonal BFGS and L-BFGS methods, for training deep neural networks (DNNs).
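For orientation, the sketch below shows a plain per-layer BFGS inverse-Hessian update with a curvature-condition skip; the Kronecker-factored construction, gradient-change pairs, and damping used in the paper are more involved, so treat this only as the generic quasi-Newton building block.

    import numpy as np

    def bfgs_inverse_update(H, s, y, eps=1e-8):
        """H: current inverse-Hessian approximation for one layer's block;
        s: parameter change; y: (mini-batch) gradient change."""
        sy = float(s @ y)
        if sy <= eps * float(s @ s):                # skip when curvature is not safely positive
            return H
        rho = 1.0 / sy
        I = np.eye(H.shape[0])
        V = I - rho * np.outer(s, y)
        return V @ H @ V.T + rho * np.outer(s, s)   # standard BFGS update of the inverse Hessian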
no code implementations • ICML 2020 • Krzysztof Choromanski, David Cheikhi, Jared Davis, Valerii Likhosherstov, Achille Nazaret, Achraf Bahamou, Xingyou Song, Mrugank Akarte, Jack Parker-Holder, Jacob Bergquist, Yuan Gao, Aldo Pacchiano, Tamas Sarlos, Adrian Weller, Vikas Sindhwani
We present a new class of stochastic, geometrically-driven optimization algorithms on the orthogonal group $O(d)$ and naturally reductive homogeneous manifolds obtained from the action of the rotation group $SO(d)$.
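A minimal example of a geometrically-driven step on O(d) is a Cayley retraction of the projected gradient, sketched below in NumPy; this is a standard Riemannian-SGD baseline, not the specific stochastic flows developed in the paper.

    import numpy as np

    def cayley_step(X, euclid_grad, lr):
        """One descent step on O(d). X: current orthogonal matrix; euclid_grad: dL/dX."""
        W = euclid_grad @ X.T - X @ euclid_grad.T   # skew-symmetric, so the Cayley map is orthogonal
        I = np.eye(X.shape[0])
        A = 0.5 * lr * W
        return np.linalg.solve(I + A, (I - A) @ X)  # iterate stays exactly on the orthogonal group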
no code implementations • 31 Dec 2019 • Achraf Bahamou, Donald Goldfarb
We also propose an adaptive version of ADAM that eliminates the need to tune the base learning rate and compares favorably with fine-tuned ADAM when training DNNs.