no code implementations • 17 Aug 2023 • Caglar Gulcehre, Tom Le Paine, Srivatsan Srinivasan, Ksenia Konyushkova, Lotte Weerts, Abhishek Sharma, Aditya Siddhant, Alex Ahern, Miaosen Wang, Chenjie Gu, Wolfgang Macherey, Arnaud Doucet, Orhan Firat, Nando de Freitas
Reinforcement learning from human feedback (RLHF) can improve the quality of large language model's (LLM) outputs by aligning them with human preferences.
no code implementations • 16 Jun 2023 • Gianluca Scarpellini, Ksenia Konyushkova, Claudio Fantacci, Tom Le Paine, Yutian Chen, Misha Denil
This paper describes $\pi2\text{vec}$, a method for representing behaviors of black box policies as feature vectors.
no code implementations • 13 Mar 2023 • Yuqing Du, Ksenia Konyushkova, Misha Denil, Akhil Raju, Jessica Landon, Felix Hill, Nando de Freitas, Serkan Cabi
Detecting successful behaviour is crucial for training intelligent agents.
no code implementations • 17 Feb 2022 • Anirudh Goyal, Abram L. Friesen, Andrea Banino, Theophane Weber, Nan Rosemary Ke, Adria Puigdomenech Badia, Arthur Guez, Mehdi Mirza, Peter C. Humphreys, Ksenia Konyushkova, Laurent SIfre, Michal Valko, Simon Osindero, Timothy Lillicrap, Nicolas Heess, Charles Blundell
In this paper we explore an alternative paradigm in which we train a network to map a dataset of past experiences to optimal behavior.
1 code implementation • NeurIPS 2021 • Ksenia Konyushkova, Yutian Chen, Tom Le Paine, Caglar Gulcehre, Cosmin Paduraru, Daniel J Mankowitz, Misha Denil, Nando de Freitas
We use multiple benchmarks, including real-world robotics, with a large number of candidate policies to show that the proposed approach improves upon state-of-the-art OPE estimates and pure online policy evaluation.
no code implementations • 12 Dec 2020 • Ksenia Konyushkova, Konrad Zolna, Yusuf Aytar, Alexander Novikov, Scott Reed, Serkan Cabi, Nando de Freitas
In offline reinforcement learning (RL) agents are trained using a logged dataset.
no code implementations • 27 Nov 2020 • Konrad Zolna, Alexander Novikov, Ksenia Konyushkova, Caglar Gulcehre, Ziyu Wang, Yusuf Aytar, Misha Denil, Nando de Freitas, Scott Reed
Behavior cloning (BC) is often practical for robot learning because it allows a policy to be trained offline without rewards, by supervised learning on expert demonstrations.
1 code implementation • 26 Sep 2019 • Serkan Cabi, Sergio Gómez Colmenarejo, Alexander Novikov, Ksenia Konyushkova, Scott Reed, Rae Jeong, Konrad Zolna, Yusuf Aytar, David Budden, Mel Vecerik, Oleg Sushkov, David Barker, Jonathan Scholz, Misha Denil, Nando de Freitas, Ziyu Wang
We present a framework for data-driven robotics that makes use of a large dataset of recorded robot experience and scales to several tasks using learned reward functions.
1 code implementation • ICLR 2019 • Ksenia Konyushkova, Raphael Sznitman, Pascal Fua
We propose a general-purpose approach to discovering active learning (AL) strategies from data.
1 code implementation • CVPR 2018 • Ksenia Konyushkova, Jasper Uijlings, Christoph Lampert, Vittorio Ferrari
We demonstrate that (1) our agents are able to learn efficient annotation strategies in several scenarios, automatically adapting to the image difficulty, the desired quality of the boxes, and the detector strength; (2) in all scenarios the resulting annotation dialogs speed up annotation compared to manual box drawing alone and box verification alone, while also outperforming any fixed combination of verification and drawing in most scenarios; (3) in a realistic scenario where the detector is iteratively re-trained, our agents evolve a series of strategies that reflect the shifting trade-off between verification and drawing as the detector grows stronger.
1 code implementation • NeurIPS 2017 • Ksenia Konyushkova, Raphael Sznitman, Pascal Fua
In this paper, we suggest a novel data-driven approach to active learning (AL).
no code implementations • 29 Jun 2016 • Ksenia Konyushkova, Raphael Sznitman, Pascal Fua
Our approach combines geometric smoothness priors in the image space with more traditional uncertainty measures to estimate which pixels or voxels are the most informative, and thus should to be annotated next.
no code implementations • 11 Nov 2015 • Ksenia Konyushkova, Nikolaos Arvanitopoulos, Zhargalma Dandarova Robert, Pierre-Yves Brandt, Sabine Süsstrunk
This paper introduces a novel approach to data analysis designed for the needs of specialists in psychology of religion.
no code implementations • ICCV 2015 • Ksenia Konyushkova, Raphael Sznitman, Pascal Fua
We propose an Active Learning approach to training a segmentation classifier that exploits geometric priors to streamline the annotation process in 3D image volumes.