Search Results for author: Nicolas Le Roux

Found 33 papers, 10 papers with code

Language-guided Skill Learning with Temporal Variational Inference

no code implementations • 26 Feb 2024 • Haotian Fu, Pratyusha Sharma, Elias Stengel-Eskin, George Konidaris, Nicolas Le Roux, Marc-Alexandre Côté, Xingdi Yuan

We present an algorithm for skill discovery from expert demonstrations.

Segmentation Variational Inference

Joint Prompt Optimization of Stacked LLMs using Variational Inference

1 code implementation • NeurIPS 2023 • Alessandro Sordoni, Xingdi Yuan, Marc-Alexandre Côté, Matheus Pereira, Adam Trischler, Ziang Xiao, Arian Hosseini, Friederike Niedtner, Nicolas Le Roux

Large language models (LLMs) can thus be seen as stochastic language layers in a language network, where the learnable parameters are the natural language prompts at each layer.

Natural Language Understanding Variational Inference

Unraveling the Interconnected Axes of Heterogeneity in Machine Learning for Democratic and Inclusive Advancements

no code implementations • 11 Jun 2023 • Maryam Molamohammadi, Afaf Taik, Nicolas Le Roux, Golnoosh Farnadi

The growing utilization of machine learning (ML) in decision-making processes raises questions about its benefits to society.

Decision Making

Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees

1 code implementation • NeurIPS 2023 • Sharan Vaswani, Amirreza Kazemi, Reza Babanezhad, Nicolas Le Roux

Instantiating the generic algorithm results in an actor that involves maximizing a sequence of surrogate functions (similar to TRPO, PPO) and a critic that involves minimizing a closely connected objective.

Reinforcement Learning (RL)

Target-based Surrogates for Stochastic Optimization

1 code implementation • 6 Feb 2023 • Jonathan Wilder Lavington, Sharan Vaswani, Reza Babanezhad, Mark Schmidt, Nicolas Le Roux

Our target optimization framework uses the (expensive) gradient computation to construct surrogate functions in a target space (e.g., the logits output by a linear model for classification) that can be minimized efficiently.

Imitation Learning Stochastic Optimization

Multi-Head Adapter Routing for Cross-Task Generalization

1 code implementation • NeurIPS 2023 • Lucas Caccia, Edoardo Ponti, Zhan Su, Matheus Pereira, Nicolas Le Roux, Alessandro Sordoni

We find that routing is most beneficial during multi-task pre-training rather than during few-shot adaptation and propose $\texttt{MHR}$-$\mu$, which discards routing and fine-tunes the average of the pre-trained adapters on each downstream task.

A general class of surrogate functions for stable and efficient reinforcement learning

1 code implementation • 12 Aug 2021 • Sharan Vaswani, Olivier Bachem, Simone Totaro, Robert Mueller, Shivam Garg, Matthieu Geist, Marlos C. Machado, Pablo Samuel Castro, Nicolas Le Roux

Common policy gradient methods rely on the maximization of a sequence of surrogate functions.

Policy Gradient Methods reinforcement-learning +1

Impact of Aliasing on Generalization in Deep Convolutional Networks

no code implementations • ICCV 2021 • Cristina Vasconcelos, Hugo Larochelle, Vincent Dumoulin, Rob Romijnders, Nicolas Le Roux, Ross Goroshin

We investigate the impact of aliasing on generalization in Deep Convolutional Networks and show that data augmentation schemes alone are unable to prevent it due to structural limitations in widely used architectures.

Data Augmentation Few-Shot Learning +1

On the Convergence of Stochastic Extragradient for Bilinear Games using Restarted Iteration Averaging

no code implementations • 30 Jun 2021 • Chris Junchi Li, Yaodong Yu, Nicolas Loizou, Gauthier Gidel, Yi Ma, Nicolas Le Roux, Michael I. Jordan

We study the stochastic bilinear minimax optimization problem, presenting an analysis of the same-sample Stochastic ExtraGradient (SEG) method with constant step size, and presenting variations of the method that yield favorable convergence.

Bridging the Gap Between Adversarial Robustness and Optimization Bias

1 code implementation • 17 Feb 2021 • Fartash Faghri, Sven Gowal, Cristina Vasconcelos, David J. Fleet, Fabian Pedregosa, Nicolas Le Roux

We demonstrate that the choice of optimizer, neural network architecture, and regularizer significantly affects the adversarial robustness of linear neural networks, providing guarantees without the need for adversarial training.

Adversarial Robustness

Batch Reinforcement Learning Through Continuation Method

no code implementations • ICLR 2021 • Yijie Guo, Shengyu Feng, Nicolas Le Roux, Ed Chi, Honglak Lee, Minmin Chen

Many real-world applications of reinforcement learning (RL) require the agent to learn from a fixed set of trajectories, without collecting new interactions.

reinforcement-learning Reinforcement Learning (RL)

An Effective Anti-Aliasing Approach for Residual Networks

no code implementations • 20 Nov 2020 • Cristina Vasconcelos, Hugo Larochelle, Vincent Dumoulin, Nicolas Le Roux, Ross Goroshin

Image pre-processing in the frequency domain has traditionally played a vital role in computer vision and was even part of the standard pipeline in the early days of deep learning.

Few-Shot Learning Image Classification +1

Beyond variance reduction: Understanding the true impact of baselines on policy optimization

no code implementations • 31 Aug 2020 • Wesley Chung, Valentin Thomas, Marlos C. Machado, Nicolas Le Roux

Traditionally, stochastic optimization theory predicts that learning dynamics are governed by the curvature of the loss function and the noise of the gradient estimates.

Reinforcement Learning (RL) Stochastic Optimization

An operator view of policy gradient methods

no code implementations • NeurIPS 2020 • Dibya Ghosh, Marlos C. Machado, Nicolas Le Roux

We cast policy gradient methods as the repeated application of two operators: a policy improvement operator $\mathcal{I}$, which maps any policy $\pi$ to a better one $\mathcal{I}\pi$, and a projection operator $\mathcal{P}$, which finds the best approximation of $\mathcal{I}\pi$ in the set of realizable policies.

Policy Gradient Methods

To Each Optimizer a Norm, To Each Norm its Generalization

no code implementations • 11 Jun 2020 • Sharan Vaswani, Reza Babanezhad, Jose Gallego, Aaron Mishkin, Simon Lacoste-Julien, Nicolas Le Roux

For under-parameterized linear classification, we prove that for any linear classifier separating the data, there exists a family of quadratic norms $\|\cdot\|_P$ such that the classifier's direction is the same as that of the maximum $P$-margin solution.

Classification General Classification

The Geometry of Sign Gradient Descent

no code implementations • ICLR 2020 • Lukas Balles, Fabian Pedregosa, Nicolas Le Roux

Sign-based optimization methods have become popular in machine learning due to their favorable communication cost in distributed optimization and their surprisingly good performance in neural network training.

Distributed Optimization

On the interplay between noise and curvature and its effect on optimization and generalization

no code implementations • 18 Jun 2019 • Valentin Thomas, Fabian Pedregosa, Bart van Merriënboer, Pierre-Antoine Manzagol, Yoshua Bengio, Nicolas Le Roux

The speed at which one can minimize an expected loss using stochastic methods depends on two properties: the curvature of the loss and the variance of the gradients.

Reducing the variance in online optimization by transporting past gradients

1 code implementation • NeurIPS 2019 • Sébastien M. R. Arnold, Pierre-Antoine Manzagol, Reza Babanezhad, Ioannis Mitliagkas, Nicolas Le Roux

While variance reduction methods have shown that reusing past gradients can be beneficial when there is a finite number of datapoints, they do not easily extend to the online setting.

Stochastic Optimization

Combining adaptive algorithms and hypergradient method: a performance and robustness study

no code implementations • ICLR 2019 • Akram Erraqabi, Nicolas Le Roux

Wilson et al. (2017) showed that, when the stepsize schedule is properly designed, stochastic gradient generalizes better than ADAM (Kingma & Ba, 2014).

Mean Replacement Pruning

no code implementations • ICLR 2019 • Utku Evci, Nicolas Le Roux, Pablo Castro, Leon Bottou

Finally, we show that the units selected by the best performing scoring functions are somewhat consistent over the course of training, implying the dead parts of the network appear during the stages of training.

Anytime Tail Averaging

no code implementations • 13 Feb 2019 • Nicolas Le Roux

Tail averaging consists in averaging the last examples in a stream.
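
As a rough illustration of the idea in this snippet, the sketch below keeps a running mean over the last few values of a stream. It assumes a fixed window size chosen in advance, which is a simplification: the class name `WindowTailAverage` and its `window` parameter are hypothetical, and this is not the anytime scheme the paper proposes.

```python
from collections import deque


class WindowTailAverage:
    """Running mean of the last `window` values seen in a stream.

    Hypothetical helper for illustration only: a fixed-window tail
    average, not the anytime method from the paper.
    """

    def __init__(self, window: int) -> None:
        if window <= 0:
            raise ValueError("window must be positive")
        self._buf = deque(maxlen=window)  # holds at most `window` values
        self._sum = 0.0                   # sum of the values currently held

    def update(self, x: float) -> float:
        """Add one value and return the mean of the retained tail."""
        if len(self._buf) == self._buf.maxlen:
            # The oldest value is about to be evicted by append().
            self._sum -= self._buf[0]
        self._buf.append(x)
        self._sum += x
        return self._sum / len(self._buf)


if __name__ == "__main__":
    avg = WindowTailAverage(window=3)
    for value in [1.0, 2.0, 3.0, 4.0, 5.0]:
        print(avg.update(value))
```

Storing the window costs memory proportional to its size, and the window has to be fixed before the stream starts.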

Distributional reinforcement learning with linear function approximation

no code implementations • 8 Feb 2019 • Marc G. Bellemare, Nicolas Le Roux, Pablo Samuel Castro, Subhodeep Moitra

Despite many algorithmic advances, our theoretical understanding of practical distributional reinforcement learning methods remains limited.

Distributional Reinforcement Learning reinforcement-learning +1

Negative eigenvalues of the Hessian in deep neural networks

no code implementations • 6 Feb 2019 • Guillaume Alain, Nicolas Le Roux, Pierre-Antoine Manzagol

The loss function of deep networks is known to be non-convex but the precise nature of this nonconvexity is still an active area of research.

A Geometric Perspective on Optimal Representations for Reinforcement Learning

no code implementations • NeurIPS 2019 • Marc G. Bellemare, Will Dabney, Robert Dadashi, Adrien Ali Taïga, Pablo Samuel Castro, Nicolas Le Roux, Dale Schuurmans, Tor Lattimore, Clare Lyle

We leverage this perspective to provide formal evidence regarding the usefulness of value functions as auxiliary tasks.

reinforcement-learning Reinforcement Learning (RL) +1

The Value Function Polytope in Reinforcement Learning

no code implementations • 31 Jan 2019 • Robert Dadashi, Adrien Ali Taïga, Nicolas Le Roux, Dale Schuurmans, Marc G. Bellemare

We establish geometric and topological properties of the space of value functions in finite state-action Markov decision processes.

reinforcement-learning Reinforcement Learning (RL)

Understanding the impact of entropy on policy optimization

1 code implementation • 27 Nov 2018 • Zafarali Ahmed, Nicolas Le Roux, Mohammad Norouzi, Dale Schuurmans

Entropy regularization is commonly used to improve policy optimization in reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

Online Hyper-Parameter Optimization

no code implementations • ICLR 2018 • Damien Vincent, Sylvain Gelly, Nicolas Le Roux, Olivier Bousquet

We propose an efficient online hyperparameter optimization method which uses a joint dynamical system to evaluate the gradient with respect to the hyperparameters.

Hyperparameter Optimization

Tracking the gradients using the Hessian: A new look at variance reducing stochastic methods

1 code implementation • 20 Oct 2017 • Robert M. Gower, Nicolas Le Roux, Francis Bach

Our goal is to improve variance reducing stochastic methods through better control variates.

Distributed SAGA: Maintaining linear convergence rate with limited communication

no code implementations • 29 May 2017 • Clément Calauzènes, Nicolas Le Roux

In recent years, variance-reducing stochastic methods have shown great practical performance, exhibiting a linear convergence rate where other stochastic methods offered only a sub-linear rate.

A comparative study of counterfactual estimators

no code implementations • 3 Apr 2017 • Thomas Nedelec, Nicolas Le Roux, Vianney Perchet

We provide a comparative study of several widely used off-policy estimators (Empirical Average, Basic Importance Sampling and Normalized Importance Sampling), detailing the different regimes where they are individually suboptimal.

counterfactual
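
For readers unfamiliar with the three estimators named in the snippet above, here is a minimal sketch using their common textbook forms. The function and argument names are assumptions made for illustration, and the paper's exact definitions may differ. Each logged sample has a reward, the probability the target policy assigns to the logged action, and the probability the logging policy assigned to it.

```python
from typing import Sequence


def empirical_average(rewards: Sequence[float]) -> float:
    """Plain mean of logged rewards; ignores the policy mismatch."""
    return sum(rewards) / len(rewards)


def _weights(target_probs: Sequence[float],
             logging_probs: Sequence[float]) -> list:
    """Importance weights pi_target(a|x) / pi_logging(a|x)."""
    return [p / q for p, q in zip(target_probs, logging_probs)]


def basic_importance_sampling(rewards: Sequence[float],
                              target_probs: Sequence[float],
                              logging_probs: Sequence[float]) -> float:
    """Weighted sum of rewards divided by the number of samples."""
    w = _weights(target_probs, logging_probs)
    return sum(wi * ri for wi, ri in zip(w, rewards)) / len(rewards)


def normalized_importance_sampling(rewards: Sequence[float],
                                   target_probs: Sequence[float],
                                   logging_probs: Sequence[float]) -> float:
    """Weighted sum of rewards divided by the sum of the weights."""
    w = _weights(target_probs, logging_probs)
    return sum(wi * ri for wi, ri in zip(w, rewards)) / sum(w)


if __name__ == "__main__":
    # Made-up toy data, purely to exercise the functions.
    rewards = [1.0, 0.0, 1.0, 1.0]
    target = [0.5, 0.5, 0.25, 0.25]
    logging = [0.25, 0.5, 0.5, 0.25]
    print(empirical_average(rewards))
    print(basic_importance_sampling(rewards, target, logging))
    print(normalized_importance_sampling(rewards, target, logging))
```

The only difference between the last two is the denominator: dividing by the sum of weights rather than the sample count typically trades a small bias for lower variance.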

Efficient iterative policy optimization

no code implementations • 28 Dec 2016 • Nicolas Le Roux

We tackle the issue of finding a good policy when the number of policy updates is limited.

Tighter bounds lead to improved classifiers

no code implementations • 29 Jun 2016 • Nicolas Le Roux

Updating the upper bound during the optimization leads to improved classification rates while transforming the learning into a sequence of minimization problems.

Classification General Classification

Minimizing Finite Sums with the Stochastic Average Gradient

2 code implementations • 10 Sep 2013 • Mark Schmidt, Nicolas Le Roux, Francis Bach

Further, in many cases the convergence rate of the new method is also faster than black-box deterministic gradient methods, in terms of the number of gradient evaluations.
