Search Results for author: Thomas Hubert

Found 10 papers, 7 papers with code

Optimizing Memory Mapping Using Deep Reinforcement Learning

no code implementations • 11 May 2023 • Pengming Wang, Mikita Sazanovich, Berkin Ilbeyi, Phitchaya Mangpo Phothilimthana, Manish Purohit, Han Yang Tay, Ngân Vũ, Miaosen Wang, Cosmin Paduraru, Edouard Leurent, Anton Zhernov, Po-Sen Huang, Julian Schrittwieser, Thomas Hubert, Robert Tung, Paula Kurylowicz, Kieran Milan, Oriol Vinyals, Daniel J. Mankowitz

We also introduce a Reinforcement Learning agent, mallocMuZero, and show that it is capable of playing this game to discover new and improved memory mapping solutions that lead to faster execution times on real ML workloads on ML accelerators.

Cloud Computing Decision Making +3

Discovering faster matrix multiplication algorithms with reinforcement learning

2 code implementations • Nature 2022 • Alhussein Fawzi, Matej Balog, Aja Huang, Thomas Hubert, Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov, Francisco J. R. Ruiz, Julian Schrittwieser, Grzegorz Swirszcz, David Silver, Demis Hassabis, Pushmeet Kohli

Particularly relevant is the case of 4 × 4 matrices in a finite field, where AlphaTensor’s algorithm improves on Strassen’s two-level algorithm for the first time, to our knowledge, since its discovery 50 years ago.

reinforcement-learning Reinforcement Learning (RL)

8,006

MuZero with Self-competition for Rate Control in VP9 Video Compression

no code implementations • 14 Feb 2022 • Amol Mandhane, Anton Zhernov, Maribeth Rauh, Chenjie Gu, Miaosen Wang, Flora Xue, Wendy Shang, Derek Pang, Rene Claus, Ching-Han Chiang, Cheng Chen, Jingning Han, Angie Chen, Daniel J. Mankowitz, Jackson Broshear, Julian Schrittwieser, Thomas Hubert, Oriol Vinyals, Timothy Mann

Specifically, we target the problem of learning a rate control policy to select the quantization parameters (QP) in the encoding process of libvpx, an open source VP9 video compression library widely used by popular video-on-demand (VOD) services.

Decision Making Quantization +1

Competition-Level Code Generation with AlphaCode

1 code implementation • DeepMind 2022 • Yujia Li, David Choi, Junyoung Chung, Nate Kushman, Julian Schrittwieser, Rémi Leblond, Tom Eccles, James Keeling, Felix Gimeno, Agustin Dal Lago, Thomas Hubert, Peter Choy, Cyprien de Masson d'Autume, Igor Babuschkin, Xinyun Chen, Po-Sen Huang, Johannes Welbl, Sven Gowal, Alexey Cherepanov, James Molloy, Daniel J. Mankowitz, Esme Sutherland Robson, Pushmeet Kohli, Nando de Freitas, Koray Kavukcuoglu, Oriol Vinyals

Programming is a powerful and ubiquitous problem-solving tool.

Ranked #1 on Code Generation on CodeContests

Code Generation

2,027

Learning and Planning in Complex Action Spaces

1 code implementation • 13 Apr 2021 • Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Mohammadamin Barekatain, Simon Schmitt, David Silver

Instead, only small subsets of actions can be sampled for the purpose of policy evaluation and improvement.

Ranked #1 on Continuous Control on acrobot.swingup

Continuous Control Game of Go

890

Online and Offline Reinforcement Learning by Planning with a Learned Model

2 code implementations • NeurIPS 2021 • Julian Schrittwieser, Thomas Hubert, Amol Mandhane, Mohammadamin Barekatain, Ioannis Antonoglou, David Silver

Combining Reanalyse with the MuZero algorithm, we introduce MuZero Unplugged, a single unified algorithm for any data budget, including offline RL.

Ranked #1 on Atari Games on Atari 2600 Bank Heist

Atari Games Continuous Control +4

Monte-Carlo Tree Search as Regularized Policy Optimization

3 code implementations • ICML 2020 • Jean-Bastien Grill, Florent Altché, Yunhao Tang, Thomas Hubert, Michal Valko, Ioannis Antonoglou, Rémi Munos

The combination of Monte-Carlo tree search (MCTS) with deep reinforcement learning has led to significant advances in artificial intelligence.

reinforcement-learning Reinforcement Learning (RL)

Approximate exploitability: Learning a best response in large games

no code implementations • 20 Apr 2020 • Finbarr Timbers, Nolan Bard, Edward Lockhart, Marc Lanctot, Martin Schmid, Neil Burch, Julian Schrittwieser, Thomas Hubert, Michael Bowling

In prior games research, agent evaluation often focused on the in-practice game outcomes.

Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

18 code implementations • 19 Nov 2019 • Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy Lillicrap, David Silver

When evaluated on Go, chess and shogi, without any knowledge of the game rules, MuZero matched the superhuman performance of the AlphaZero algorithm that was supplied with the game rules.

Ranked #1 on Atari Games on Atari 2600 Alien

Atari Games Atari Games 100k +3

2,390

Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

59 code implementations • 5 Dec 2017 • David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis

The game of chess is the most widely-studied domain in the history of artificial intelligence.

Ranked #1 on Game of Go on ELO Ratings

Game of Chess Game of Go +4

31,421

