2 code implementations • 12 Jan 2023 • Matko Bošnjak, Pierre H. Richemond, Nenad Tomasev, Florian Strub, Jacob C. Walker, Felix Hill, Lars Holger Buesing, Razvan Pascanu, Charles Blundell, Jovana Mitrovic
We propose a new semi-supervised learning method, Semantic Positives via Pseudo-Labels (SemPPL), that combines labelled and unlabelled data to learn informative representations.
no code implementations • 1 Jan 2021 • Giambattista Parascandolo, Lars Holger Buesing, Josh Merel, Leonard Hasenclever, John Aslanides, Jessica B Hamrick, Nicolas Heess, Alexander Neitz, Theophane Weber
are constrained by an implicit sequential planning assumption: the order in which a plan is constructed is the same as the order in which it is executed.
no code implementations • 1 Jan 2021 • Thomas Mesnard, Theophane Weber, Fabio Viola, Shantanu Thakoor, Alaa Saade, Anna Harutyunyan, Will Dabney, Tom Stepleton, Nicolas Heess, Marcus Hutter, Lars Holger Buesing, Remi Munos
Credit assignment in reinforcement learning is the problem of measuring an action’s influence on future rewards.