no code implementations • 2 Oct 2023 • Wei-Di Chang, Scott Fujimoto, David Meger, Gregory Dudek
Imitation Learning from Observation (ILfO) is a setting in which a learner tries to imitate the behavior of an expert, using only observational data and without the direct guidance of demonstrated actions.
2 code implementations • NeurIPS 2023 • Scott Fujimoto, Wei-Di Chang, Edward J. Smith, Shixiang Shane Gu, Doina Precup, David Meger
In the field of reinforcement learning (RL), representation learning is a proven tool for complex image-based tasks, but is often overlooked for environments with low-level states, such as physical control problems.
no code implementations • 19 May 2022 • Wei-Di Chang, Juan Camilo Gamboa Higuera, Scott Fujimoto, David Meger, Gregory Dudek
We present an algorithm for Inverse Reinforcement Learning (IRL) from expert state observations only.
no code implementations • 28 Jan 2022 • Scott Fujimoto, David Meger, Doina Precup, Ofir Nachum, Shixiang Shane Gu
In this work, we study the use of the Bellman equation as a surrogate objective for value prediction accuracy.
no code implementations • 29 Sep 2021 • Scott Fujimoto, David Meger, Doina Precup, Ofir Nachum, Shixiang Shane Gu
In this work, we analyze the effectiveness of the Bellman equation as a proxy objective for value prediction accuracy in off-policy evaluation.
8 code implementations • NeurIPS 2021 • Scott Fujimoto, Shixiang Shane Gu
Offline reinforcement learning (RL) defines the task of learning from a fixed batch of data.
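The minimalist approach this paper proposes (TD3+BC) adds a behavior-cloning term to the TD3 policy objective. A toy NumPy sketch of that policy loss, with illustrative stand-in numbers for the critic values and actions (real TD3+BC uses neural networks for both):

```python
import numpy as np

# Stand-ins for a small batch; in TD3+BC these come from the critic Q
# and policy pi evaluated on dataset states.
q_values = np.array([50.0, 55.0, 45.0])      # Q(s, pi(s))
pi_actions = np.array([0.2, -0.1, 0.4])      # pi(s)
data_actions = np.array([0.25, 0.0, 0.3])    # actions a from the fixed batch

# Policy objective: maximize Q while regressing toward the dataset
# actions; lambda normalizes the Q scale (alpha = 2.5 in the paper).
alpha = 2.5
lam = alpha / np.abs(q_values).mean()
policy_loss = -(lam * q_values).mean() + ((pi_actions - data_actions) ** 2).mean()
```

The normalization by the mean absolute Q-value keeps the two terms on comparable scales regardless of the environment's reward magnitude.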
1 code implementation • 12 Jun 2021 • Scott Fujimoto, David Meger, Doina Precup
We bridge the gap between MIS and deep reinforcement learning by observing that the density ratio can be computed from the successor representation of the target policy.
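In the tabular case, the connection stated above can be sketched directly: the successor representation gives discounted visitation counts, from which the policy's discounted state occupancy, and hence a density ratio against a dataset distribution, follows. A minimal NumPy illustration on a hypothetical 3-state MDP (all numbers are made up for the example):

```python
import numpy as np

gamma = 0.9
P_pi = np.array([[0.8, 0.2, 0.0],    # state-to-state transitions under pi
                 [0.1, 0.6, 0.3],
                 [0.0, 0.2, 0.8]])
mu0 = np.array([1.0, 0.0, 0.0])      # initial state distribution

# Successor representation: expected discounted future visitations,
# Psi = (I - gamma * P_pi)^-1.
psi = np.linalg.inv(np.eye(3) - gamma * P_pi)

# The discounted state occupancy of pi follows directly from the SR.
d_pi = (1 - gamma) * mu0 @ psi

# Density ratio against a (hypothetical) dataset state distribution d_D.
d_D = np.array([0.5, 0.3, 0.2])
ratio = d_pi / d_D
```

Since each row of `P_pi` sums to one, `d_pi` is a proper distribution; the deep-RL version of the paper replaces the matrix inverse with a learned successor representation.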
no code implementations • 1 Jan 2021 • Scott Fujimoto, David Meger, Doina Precup
We bridge the gap between MIS and deep reinforcement learning by observing that the density ratio can be computed from the successor representation of the target policy.
1 code implementation • NeurIPS 2020 • Scott Fujimoto, David Meger, Doina Precup
Prioritized Experience Replay (PER) is a deep reinforcement learning technique in which agents learn from transitions sampled with non-uniform probability proportional to their temporal-difference error.
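The sampling rule described above can be sketched in a few lines of NumPy; the buffer contents and hyperparameter values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical replay buffer of 5 transitions with TD errors.
td_errors = np.array([0.1, 2.0, 0.5, 0.05, 1.0])

alpha = 0.6          # priority exponent (0 = uniform, 1 = fully prioritized)
eps = 1e-6           # keeps zero-error transitions sampleable

priorities = (np.abs(td_errors) + eps) ** alpha
probs = priorities / priorities.sum()

# Sample a minibatch of indices with probability proportional to priority.
batch = rng.choice(len(td_errors), size=3, p=probs)

# Importance-sampling weights correct the bias of non-uniform sampling.
beta = 0.4
weights = (len(td_errors) * probs[batch]) ** (-beta)
weights /= weights.max()
```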
4 code implementations • 3 Oct 2019 • Scott Fujimoto, Edoardo Conti, Mohammad Ghavamzadeh, Joelle Pineau
Widely used deep reinforcement learning algorithms have been shown to fail in the batch setting: learning from a fixed data set without interaction with the environment.
1 code implementation • 31 Jan 2019 • Edward J. Smith, Scott Fujimoto, Adriana Romero, David Meger
Mesh models are a promising approach for encoding the structure of 3D objects.
Ranked #1 on 3D Object Reconstruction on 3D-R2N2 (Avg F1 metric)
10 code implementations • 7 Dec 2018 • Scott Fujimoto, David Meger, Doina Precup
Many practical applications of reinforcement learning constrain agents to learn from a fixed batch of data which has already been gathered, without offering further possibility for data collection.
2 code implementations • 1 Nov 2018 • Jason Gauci, Edoardo Conti, Yitao Liang, Kittipat Virochsiri, Yuchen He, Zachary Kaden, Vivek Narayanan, Xiaohui Ye, Zhengxing Chen, Scott Fujimoto
In this paper, we present Horizon, Facebook's open-source applied reinforcement learning (RL) platform.
no code implementations • 27 Sep 2018 • Scott Fujimoto, David Meger, Doina Precup
This work examines batch reinforcement learning: the task of maximally exploiting a given batch of off-policy data without further data collection.
no code implementations • NAACL 2018 • Kian Kenyon-Dean, Eisha Ahmed, Scott Fujimoto, Jeremy Georges-Filteau, Christopher Glasz, Barleen Kaur, Auguste Lalande, Shruti Bhanderi, Robert Belfer, Nirmal Kanagasabai, Roman Sarrazin-Gendron, Rohit Verma, Derek Ruths
In datasets constructed for the purpose of Twitter sentiment analysis (TSA), these controversial examples can compose over 30% of the originally annotated data.
3 code implementations • NeurIPS 2018 • Edward Smith, Scott Fujimoto, David Meger
We consider the problem of scaling deep generative shape models to high-resolution.
Ranked #2 on 3D Object Reconstruction on 3D-R2N2 (Avg F1 metric)
67 code implementations • ICML 2018 • Scott Fujimoto, Herke van Hoof, David Meger
In value-based reinforcement learning methods such as deep Q-learning, function approximation errors are known to lead to overestimated value estimates and suboptimal policies.
Ranked #2 on Continuous Control on Lunar Lander (OpenAI Gym)
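The core remedy this paper (TD3) introduces for the overestimation described above is clipped double Q-learning: bootstrapping from the minimum of two target critics. A toy NumPy sketch of the target computation, with placeholder numbers standing in for the twin target-network outputs:

```python
import numpy as np

# Toy batch: rewards, discount, and two target-critic estimates
# (placeholders; in TD3 these come from twin target networks).
rewards = np.array([1.0, 0.0, 0.5])
gamma = 0.99
q1_target = np.array([10.0, 8.0, 6.0])
q2_target = np.array([9.0, 8.5, 5.0])

# Clipped double Q-learning: take the elementwise minimum of the two
# critics, which damps the overestimation bias of a single estimate.
y = rewards + gamma * np.minimum(q1_target, q2_target)
```

Both critics are then regressed toward the same target `y`, so neither can inflate the bootstrap value on its own.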