no code implementations • 9 Sep 2021 • Donghwan Lee, Han-Dong Lim, Jihoon Park, Okyong Choi
Sutton, Szepesv\'{a}ri and Maei introduced the first gradient temporal-difference (GTD) learning algorithms compatible with both linear function approximation and off-policy training.