no code implementations • 19 Jan 2024 • Dayang Liang, Yaru Zhang, Yunlong Liu
As a result, our method simultaneously makes full use of the retrieved information and evaluates state values more accurately through a Temporal Difference (TD) loss.
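For readers unfamiliar with the term, a TD loss is typically the squared difference between a state-value estimate and a bootstrapped target. The block below is a generic one-step TD(0) sketch in Python, not the authors' implementation. The function name `td_loss`, its parameters, and the default discount factor are illustrative assumptions.

```python
def td_loss(v_s, v_next, reward, gamma=0.99, done=False):
    """Squared one-step TD error for a single transition (generic sketch).

    v_s    : current estimate of the value of state s
    v_next : current estimate of the value of the next state s'
    reward : reward observed on the transition s -> s'
    gamma  : discount factor (assumed default, not taken from the paper)
    done   : True if s' is terminal, in which case no bootstrapping is done
    """
    # Bootstrapped target: r + gamma * V(s'), or just r at episode end.
    target = reward + (0.0 if done else gamma * v_next)
    td_error = target - v_s
    return 0.5 * td_error ** 2


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the function.
    print(td_loss(v_s=1.0, v_next=2.0, reward=0.5, gamma=0.9))
```

Minimizing this quantity moves the estimate for the current state toward the reward plus the discounted estimate for the next state, which is the sense in which a TD loss improves the evaluation of state values.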