8 May 2023 • André Correia, Luís Alexandre
We propose a task-agnostic method that leverages small sets of safe and unsafe demonstrations to improve the safety of reinforcement learning (RL) agents during learning.