no code implementations • 26 Feb 2024 • Haotian Fu, Pratyusha Sharma, Elias Stengel-Eskin, George Konidaris, Nicolas Le Roux, Marc-Alexandre Côté, Xingdi Yuan
We present an algorithm for skill discovery from expert demonstrations.
1 code implementation • NeurIPS 2023 • Alessandro Sordoni, Xingdi Yuan, Marc-Alexandre Côté, Matheus Pereira, Adam Trischler, Ziang Xiao, Arian Hosseini, Friederike Niedtner, Nicolas Le Roux
Thus, they can be seen as stochastic language layers in a language network, where the learnable parameters are the natural language prompts at each layer.
no code implementations • 11 Jun 2023 • Maryam Molamohammadi, Afaf Taik, Nicolas Le Roux, Golnoosh Farnadi
The growing utilization of machine learning (ML) in decision-making processes raises questions about its benefits to society.
1 code implementation • NeurIPS 2023 • Sharan Vaswani, Amirreza Kazemi, Reza Babanezhad, Nicolas Le Roux
Instantiating the generic algorithm results in an actor that involves maximizing a sequence of surrogate functions (similar to TRPO, PPO) and a critic that involves minimizing a closely connected objective.
1 code implementation • 6 Feb 2023 • Jonathan Wilder Lavington, Sharan Vaswani, Reza Babanezhad, Mark Schmidt, Nicolas Le Roux
Our target optimization framework uses the (expensive) gradient computation to construct surrogate functions in a \emph{target space} (e. g. the logits output by a linear model for classification) that can be minimized efficiently.
1 code implementation • NeurIPS 2023 • Lucas Caccia, Edoardo Ponti, Zhan Su, Matheus Pereira, Nicolas Le Roux, Alessandro Sordoni
We find that routing is most beneficial during multi-task pre-training rather than during few-shot adaptation and propose $\texttt{MHR}$-$\mu$, which discards routing and fine-tunes the average of the pre-trained adapters on each downstream tasks.
1 code implementation • 12 Aug 2021 • Sharan Vaswani, Olivier Bachem, Simone Totaro, Robert Mueller, Shivam Garg, Matthieu Geist, Marlos C. Machado, Pablo Samuel Castro, Nicolas Le Roux
Common policy gradient methods rely on the maximization of a sequence of surrogate functions.
no code implementations • ICCV 2021 • Cristina Vasconcelos, Hugo Larochelle, Vincent Dumoulin, Rob Romijnders, Nicolas Le Roux, Ross Goroshin
We investigate the impact of aliasing on generalization in Deep Convolutional Networks and show that data augmentation schemes alone are unable to prevent it due to structural limitations in widely used architectures.
no code implementations • 30 Jun 2021 • Chris Junchi Li, Yaodong Yu, Nicolas Loizou, Gauthier Gidel, Yi Ma, Nicolas Le Roux, Michael I. Jordan
We study the stochastic bilinear minimax optimization problem, presenting an analysis of the same-sample Stochastic ExtraGradient (SEG) method with constant step size, and presenting variations of the method that yield favorable convergence.
1 code implementation • 17 Feb 2021 • Fartash Faghri, Sven Gowal, Cristina Vasconcelos, David J. Fleet, Fabian Pedregosa, Nicolas Le Roux
We demonstrate that the choice of optimizer, neural network architecture, and regularizer significantly affect the adversarial robustness of linear neural networks, providing guarantees without the need for adversarial training.
no code implementations • ICLR 2021 • Yijie Guo, Shengyu Feng, Nicolas Le Roux, Ed Chi, Honglak Lee, Minmin Chen
Many real-world applications of reinforcement learning (RL) require the agent to learn from a fixed set of trajectories, without collecting new interactions.
no code implementations • 20 Nov 2020 • Cristina Vasconcelos, Hugo Larochelle, Vincent Dumoulin, Nicolas Le Roux, Ross Goroshin
Image pre-processing in the frequency domain has traditionally played a vital role in computer vision and was even part of the standard pipeline in the early days of deep learning.
no code implementations • 31 Aug 2020 • Wesley Chung, Valentin Thomas, Marlos C. Machado, Nicolas Le Roux
Traditionally, stochastic optimization theory predicts that learning dynamics are governed by the curvature of the loss function and the noise of the gradient estimates.
no code implementations • NeurIPS 2020 • Dibya Ghosh, Marlos C. Machado, Nicolas Le Roux
We cast policy gradient methods as the repeated application of two operators: a policy improvement operator $\mathcal{I}$, which maps any policy $\pi$ to a better one $\mathcal{I}\pi$, and a projection operator $\mathcal{P}$, which finds the best approximation of $\mathcal{I}\pi$ in the set of realizable policies.
no code implementations • 11 Jun 2020 • Sharan Vaswani, Reza Babanezhad, Jose Gallego, Aaron Mishkin, Simon Lacoste-Julien, Nicolas Le Roux
For under-parameterized linear classification, we prove that for any linear classifier separating the data, there exists a family of quadratic norms ||.||_P such that the classifier's direction is the same as that of the maximum P-margin solution.
no code implementations • ICLR 2020 • Lukas Balles, Fabian Pedregosa, Nicolas Le Roux
Sign-based optimization methods have become popular in machine learning due to their favorable communication cost in distributed optimization and their surprisingly good performance in neural network training.
no code implementations • 18 Jun 2019 • Valentin Thomas, Fabian Pedregosa, Bart van Merriënboer, Pierre-Antoine Mangazol, Yoshua Bengio, Nicolas Le Roux
The speed at which one can minimize an expected loss using stochastic methods depends on two properties: the curvature of the loss and the variance of the gradients.
1 code implementation • NeurIPS 2019 • Sébastien M. R. Arnold, Pierre-Antoine Manzagol, Reza Babanezhad, Ioannis Mitliagkas, Nicolas Le Roux
While variance reduction methods have shown that reusing past gradients can be beneficial when there is a finite number of datapoints, they do not easily extend to the online setting.
no code implementations • ICLR 2019 • Akram Erraqabi, Nicolas Le Roux
Wilson et al. (2017) showed that, when the stepsize schedule is properly designed, stochastic gradient generalizes better than ADAM (Kingma & Ba, 2014).
no code implementations • ICLR 2019 • Utku Evci, Nicolas Le Roux, Pablo Castro, Leon Bottou
Finally, we show that the units selected by the best performing scoring functions are somewhat consistent over the course of training, implying the dead parts of the network appear during the stages of training.
no code implementations • 13 Feb 2019 • Nicolas Le Roux
Tail averaging consists in averaging the last examples in a stream.
no code implementations • 8 Feb 2019 • Marc G. Bellemare, Nicolas Le Roux, Pablo Samuel Castro, Subhodeep Moitra
Despite many algorithmic advances, our theoretical understanding of practical distributional reinforcement learning methods remains limited.
Distributional Reinforcement Learning reinforcement-learning +1
no code implementations • 6 Feb 2019 • Guillaume Alain, Nicolas Le Roux, Pierre-Antoine Manzagol
The loss function of deep networks is known to be non-convex but the precise nature of this nonconvexity is still an active area of research.
no code implementations • NeurIPS 2019 • Marc G. Bellemare, Will Dabney, Robert Dadashi, Adrien Ali Taiga, Pablo Samuel Castro, Nicolas Le Roux, Dale Schuurmans, Tor Lattimore, Clare Lyle
We leverage this perspective to provide formal evidence regarding the usefulness of value functions as auxiliary tasks.
no code implementations • 31 Jan 2019 • Robert Dadashi, Adrien Ali Taïga, Nicolas Le Roux, Dale Schuurmans, Marc G. Bellemare
We establish geometric and topological properties of the space of value functions in finite state-action Markov decision processes.
1 code implementation • 27 Nov 2018 • Zafarali Ahmed, Nicolas Le Roux, Mohammad Norouzi, Dale Schuurmans
Entropy regularization is commonly used to improve policy optimization in reinforcement learning.
no code implementations • ICLR 2018 • Damien Vincent, Sylvain Gelly, Nicolas Le Roux, Olivier Bousquet
We propose an efficient online hyperparameter optimization method which uses a joint dynamical system to evaluate the gradient with respect to the hyperparameters.
1 code implementation • 20 Oct 2017 • Robert M. Gower, Nicolas Le Roux, Francis Bach
Our goal is to improve variance reducing stochastic methods through better control variates.
no code implementations • 29 May 2017 • Clément Calauzènes, Nicolas Le Roux
In recent years, variance-reducing stochastic methods have shown great practical performance, exhibiting linear convergence rate when other stochastic methods offered a sub-linear rate.
no code implementations • 3 Apr 2017 • Thomas Nedelec, Nicolas Le Roux, Vianney Perchet
We provide a comparative study of several widely used off-policy estimators (Empirical Average, Basic Importance Sampling and Normalized Importance Sampling), detailing the different regimes where they are individually suboptimal.
no code implementations • 28 Dec 2016 • Nicolas Le Roux
We tackle the issue of finding a good policy when the number of policy updates is limited.
no code implementations • 29 Jun 2016 • Nicolas Le Roux
Updating the upper bound during the optimization leads to improved classification rates while transforming the learning into a sequence of minimization problems.
2 code implementations • 10 Sep 2013 • Mark Schmidt, Nicolas Le Roux, Francis Bach
Further, in many cases the convergence rate of the new method is also faster than black-box deterministic gradient methods, in terms of the number of gradient evaluations.