no code implementations • 27 May 2024 • Adriana Hugessen, Roger Creus Castanyer, Faisal Mohamed, Glen Berseth
In an effort to find a single entropy-based method that will encourage emergent behaviors in any environment, we propose an agent that can adapt its objective online, depending on the entropy conditions, by framing the choice as a multi-armed bandit problem.
no code implementations • 4 Oct 2023 • Raj Ghugare, Santiago Miret, Adriana Hugessen, Mariano Phielipp, Glen Berseth
Reinforcement learning (RL) over text representations can be effective for finding high-value policies that can search over graphs.