1 code implementation • NeurIPS 2023 • Blake Bordelon, Paul Masset, Henry Kuo, Cengiz Pehlevan
We study how learning dynamics and plateaus depend on feature structure, learning rate, discount factor, and reward function.
reinforcement-learning