1 code implementation • NeurIPS 2023 • Benjamin Ellis, Jonathan Cook, Skander Moalla, Mikayel Samvelyan, Mingfei Sun, Anuj Mahajan, Jakob N. Foerster, Shimon Whiteson
In this work, we present new analysis demonstrating that SMAC lacks sufficient stochasticity and partial observability to require complex *closed-loop* policies.
no code implementations • 28 Oct 2022 • Dong-Ki Kim, Matthew Riemer, Miao Liu, Jakob N. Foerster, Gerald Tesauro, Jonathan P. How
By directly comparing active equilibria to Nash equilibria in these examples, we find that active equilibria yield more effective solutions than Nash equilibria, leading us to conclude that an active equilibrium is the desired solution concept for multi-agent learning settings.
no code implementations • 11 Oct 2022 • Isaac Liao, Rumen R. Dangovski, Jakob N. Foerster, Marin Soljačić
This paper introduces a novel machine learning optimizer called LODO, which meta-learns a preconditioner online during optimization.
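The sentence above only names the idea; the following toy sketch (my own illustration, not the paper's LODO implementation) shows what "meta-learning a preconditioner online" can mean in the simplest diagonal case, using the standard hypergradient identity for the update x ← x − p·g.

```python
import numpy as np

# Minimal sketch, assuming a diagonal preconditioner and exact gradients:
# for the update x <- x - p * g_prev, the hypergradient of the next loss
# with respect to p is -g * g_prev (elementwise), so ascending it nudges
# p to align successive gradients.

A = np.diag([1.0, 100.0])          # ill-conditioned quadratic f(x) = 0.5 x^T A x
loss = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

x = np.array([1.0, 1.0])
p = np.full(2, 1e-3)               # diagonal preconditioner, learned online
meta_lr = 1e-7
initial_loss = loss(x)

g_prev = grad(x)
for _ in range(500):
    x = x - p * g_prev             # preconditioned descent step
    g = grad(x)
    p = p + meta_lr * g * g_prev   # online meta-update of the preconditioner
    g_prev = g

final_loss = loss(x)
```

On this quadratic, the learned preconditioner grows larger along the stiff coordinate, which is exactly the role a preconditioner is meant to play.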
no code implementations • 20 Jul 2022 • Tim Franzmeyer, Stephen Mcaleer, João F. Henriques, Jakob N. Foerster, Philip H. S. Torr, Adel Bibi, Christian Schroeder de Witt
Autonomous agents deployed in the real world need to be robust against adversarial attacks on sensory inputs.
no code implementations • NeurIPS 2021 • Brandon Cui, Hengyuan Hu, Luis Pineda, Jakob N. Foerster
The standard problem formulation in cooperative multi-agent learning is self-play (SP), where the goal is to train a team of agents that works well together.
no code implementations • 13 Jul 2022 • Hengyuan Hu, Samuel Sokota, David Wu, Anton Bakhtin, Andrei Lupu, Brandon Cui, Jakob N. Foerster
Fully cooperative, partially observable multi-agent problems are ubiquitous in the real world.
1 code implementation • 7 Mar 2022 • Dong-Ki Kim, Matthew Riemer, Miao Liu, Jakob N. Foerster, Michael Everett, Chuangchuang Sun, Gerald Tesauro, Jonathan P. How
An effective approach that has recently emerged for addressing this non-stationarity is for each agent to anticipate the learning of other agents and influence the evolution of future policies towards desirable behavior for its own benefit.
1 code implementation • 11 Feb 2020 • Dmitrii Beloborodov, A. E. Ulanov, Jakob N. Foerster, Shimon Whiteson, A. I. Lvovsky
Quantum hardware and quantum-inspired algorithms are becoming increasingly popular for combinatorial optimization.
4 code implementations • ICLR 2020 • Hengyuan Hu, Jakob N. Foerster
Learning to be informative when observed by others is an interesting challenge for Reinforcement Learning (RL): Fundamentally, RL requires agents to explore in order to discover good policies.
2 code implementations • 23 Oct 2019 • Reda Bahi Slaoui, William R. Clements, Jakob N. Foerster, Sébastien Toth
Producing agents that can generalize to a wide range of visually different environments is a significant challenge in reinforcement learning.
no code implementations • 25 Sep 2019 • Reda Bahi Slaoui, William R. Clements, Jakob N. Foerster, Sébastien Toth
In this work, we formalize the domain randomization problem, and show that minimizing the policy's Lipschitz constant with respect to the randomization parameters leads to low variance in the learned policies.
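The abstract above describes penalizing the policy's sensitivity to the randomization parameters. A hypothetical sketch of that regularizer (function and variable names are mine, not the paper's code) using a finite-difference estimate:

```python
import numpy as np

# Sketch, assuming a linear stand-in policy and a scalar randomization
# parameter xi: estimate || d policy / d xi ||^2 by finite differences and
# use it as a penalty, so visually different renderings of the same state
# map to similar action logits.

def policy_logits(obs, theta):
    return obs @ theta                      # stand-in for a network forward pass

def lipschitz_penalty(render, theta, xi, eps=1e-2):
    # render(xi) produces the observation under randomization parameter xi.
    d = policy_logits(render(xi + eps), theta) - policy_logits(render(xi), theta)
    return float(np.sum(d ** 2) / eps ** 2)

# Toy usage: a 'rendering' whose brightness depends on xi, versus one that
# ignores xi entirely.
theta = np.ones((3, 2))
base = np.array([0.1, 0.2, 0.3])
sensitive = lipschitz_penalty(lambda xi: base * (1.0 + xi), theta, xi=0.0)
invariant = lipschitz_penalty(lambda xi: base, theta, xi=0.0)
```

A policy whose output does not depend on the randomization parameter incurs zero penalty, which is the low-variance behaviour the abstract argues for.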
2 code implementations • 9 Sep 2019 • Thomas D. Barrett, William R. Clements, Jakob N. Foerster, A. I. Lvovsky
Our approach of exploratory combinatorial optimization (ECO-DQN) is, in principle, applicable to any combinatorial problem that can be defined on a graph.
1 code implementation • 1 Feb 2019 • Nolan Bard, Jakob N. Foerster, Sarath Chandar, Neil Burch, Marc Lanctot, H. Francis Song, Emilio Parisotto, Vincent Dumoulin, Subhodeep Moitra, Edward Hughes, Iain Dunning, Shibl Mourad, Hugo Larochelle, Marc G. Bellemare, Michael Bowling
From the early days of computing, games have been important testbeds for studying how well machines can perform sophisticated decision making.
1 code implementation • 4 Nov 2018 • Jakob N. Foerster, Francis Song, Edward Hughes, Neil Burch, Iain Dunning, Shimon Whiteson, Matthew Botvinick, Michael Bowling
We present the Bayesian action decoder (BAD), a new multi-agent learning method that uses an approximate Bayesian update to obtain a public belief that conditions on the actions taken by all agents in the environment.
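A minimal sketch of the public-belief idea (my own illustration, not the paper's BAD implementation): every agent maintains the same belief over the private state and updates it by Bayes' rule after each publicly observed action, using the commonly known policy.

```python
import numpy as np

# Sketch, assuming a finite private state and a publicly known policy
# table: the shared belief is updated by
#   posterior(s) ∝ belief(s) * policy(action | s).

def update_belief(belief, policy, action):
    # belief[s] = P(private state s); policy[s, a] = P(a | s).
    posterior = belief * policy[:, action]
    return posterior / posterior.sum()

# Toy usage: two possible hidden cards, two actions.
prior = np.array([0.5, 0.5])
policy = np.array([[0.9, 0.1],     # action probabilities if holding card 0
                   [0.2, 0.8]])    # action probabilities if holding card 1
posterior = update_belief(prior, policy, action=0)
```

Observing action 0 shifts the shared belief toward card 0, since that card makes the action much more likely under the known policy.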
1 code implementation • NeurIPS 2019 • Christian A. Schroeder de Witt, Jakob N. Foerster, Gregory Farquhar, Philip H. S. Torr, Wendelin Boehmer, Shimon Whiteson
In this paper, we show that common knowledge between agents allows for complex decentralised coordination.
6 code implementations • 13 Sep 2017 • Jakob N. Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch
We also show that the LOLA update rule can be efficiently calculated using an extension of the policy gradient estimator, making the method suitable for model-free RL.
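To make the LOLA update concrete, here is a toy sketch (my own illustration, using numerical rather than analytic gradients): agent 1 does not ascend V1(θ1, θ2) directly, but V1(θ1, θ2 + η·∇V2), differentiating through one naive learning step of its opponent.

```python
import numpy as np

# Sketch, assuming exact value functions and central finite differences.
# The inner gradient below also depends on t1, which is where LOLA's
# opponent-shaping (second-order) term comes from.

def ngrad(f, x, eps=1e-5):
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

def lola_step(V1, V2, th1, th2, lr=0.1, eta=0.1):
    # Value of th1 at the opponent's *anticipated* parameters after one
    # naive gradient step on V2.
    look = lambda t1: V1(t1, th2 + eta * ngrad(lambda t2: V2(t1, t2), th2))
    return th1 + lr * ngrad(look, th1)

# Toy usage: a one-parameter-per-agent coordination game, where each
# parameter is the logit of "cooperate".
sig = lambda z: 1.0 / (1.0 + np.exp(-z))
V = lambda a, b: float(sig(a[0]) * sig(b[0]) + (1 - sig(a[0])) * (1 - sig(b[0])))
th1, th2 = np.array([0.5]), np.array([0.5])
new_th1 = lola_step(V, V, th1, th2)
```

In this symmetric game the lookahead pushes agent 1 further toward the coordination corner than a naive gradient step would, because it accounts for the opponent moving the same way.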
no code implementations • ICML 2017 • Jakob N. Foerster, Justin Gilmer, Jan Chorowski, Jascha Sohl-Dickstein, David Sussillo
There exist many problem domains where the interpretability of neural network models is essential for deployment.
3 code implementations • NeurIPS 2016 • Jakob N. Foerster, Yannis M. Assael, Nando de Freitas, Shimon Whiteson
We consider the problem of multiple agents sensing and acting in environments with the goal of maximising their shared utility.
no code implementations • 8 Feb 2016 • Jakob N. Foerster, Yannis M. Assael, Nando de Freitas, Shimon Whiteson
We propose deep distributed recurrent Q-networks (DDRQN), which enable teams of agents to learn to solve communication-based coordination tasks.