no code implementations • 7 Feb 2024 • Natasha Butt, Blazej Manczak, Auke Wiggers, Corrado Rainone, David Zhang, Michaël Defferrard, Taco Cohen
Our method iterates between 1) program sampling and hindsight relabeling, and 2) learning from prioritized experience replay.
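As a rough illustration of the two alternating steps named in this abstract, the following minimal sketch uses a toy integer DSL. Everything in it is an assumption for illustration only and is not taken from the paper: the `OPS` table, the random `sample_program` stand-in for a learned policy, the length-based priority, and the `PrioritizedReplay` class are all hypothetical.

```python
import random

# Toy DSL (hypothetical): a program is a list of op names applied left to right.
OPS = {"inc": lambda x: x + 1, "double": lambda x: x * 2, "neg": lambda x: -x}


def run_program(program, x):
    """Execute a program on an integer input."""
    for op in program:
        x = OPS[op](x)
    return x


def sample_program(rng, max_len=3):
    """Stand-in for a learned policy: draw a random program."""
    return [rng.choice(sorted(OPS)) for _ in range(rng.randint(1, max_len))]


def hindsight_relabel(program, x):
    """Use the output the program actually produced as the target,
    so every sampled program is a correct example for some task."""
    return {"input": x, "target": run_program(program, x), "program": program}


class PrioritizedReplay:
    """Minimal proportional prioritized replay (no importance weights)."""

    def __init__(self):
        self.items, self.priorities = [], []

    def add(self, item, priority):
        self.items.append(item)
        self.priorities.append(priority)

    def sample(self, rng, k):
        # Draw k items with probability proportional to their priority.
        return rng.choices(self.items, weights=self.priorities, k=k)


def iterate(rng, buffer, inputs, n_samples=4, batch_size=2):
    """One round: 1) sample and relabel, 2) draw a prioritized batch."""
    for x in inputs:
        for _ in range(n_samples):
            example = hindsight_relabel(sample_program(rng), x)
            # Assumed priority: longer programs are treated as more informative.
            buffer.add(example, priority=float(len(example["program"])))
    # A real system would run a training step on this batch.
    return buffer.sample(rng, batch_size)


rng = random.Random(0)
buffer = PrioritizedReplay()
batch = iterate(rng, buffer, inputs=[1, 2, 3])
```

The sketch stops at drawing the batch; the model update and any priority refresh after training are left out because the snippet above does not describe them.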
1 code implementation • 3 Nov 2023 • Blazej Manczak, Jan Viebahn, Herke van Hoof
In this study, all agents still use a purely rule-based policy at the highest level, whereas the intermediate-level policy is trained using different state-of-the-art RL algorithms.
Hierarchical Reinforcement Learning • Reinforcement Learning