1 Jan 2021 • Namyong Kim, Hyunsuk Baek, Hayong Shin
Gradient-based policy search algorithms in deep reinforcement learning (DRL), such as PPO, SAC, and TD3, have achieved strong results on a range of challenging control tasks.