1 code implementation • 7 Jun 2023 • Michael Giegrich, Roel Oomen, Christoph Reisinger
In this paper, we propose a novel $K$-nearest neighbor resampling procedure for estimating the performance of a policy from historical data containing realized episodes of a decision process generated under a different policy.
no code implementations • 1 Nov 2022 • Michael Giegrich, Christoph Reisinger, Yufei Zhang
We study the global linear convergence of policy gradient (PG) methods for finite-horizon continuous-time exploratory linear-quadratic control (LQC) problems.