no code implementations • 7 Mar 2021 • Abi Komanduru, Jean Honorio
Inverse reinforcement learning (IRL) is the task of finding a reward function that generates a desired optimal policy for a given Markov Decision Process (MDP).
1 code implementation • NeurIPS 2019 • Abi Komanduru, Jean Honorio
The paper further analyzes the proposed formulation of inverse reinforcement learning with $n$ states and $k$ actions, and shows a sample complexity of $O(n^2 \log (nk))$ for recovering a reward function that generates a policy that satisfies Bellman's optimality condition with respect to the true transition probabilities.