1 code implementation • 8 Apr 2024 • David Valensi, Esther Derman, Shie Mannor, Gal Dalal
We show that, given observed delay values, it is sufficient to search within the class of Markov policies to reach optimal performance, thus extending the result for the deterministic fixed-delay case.