1 code implementation • 26 Dec 2023 • Jingpu Yang, Helin Wang, Qirui Zhao, Zhecheng Shi, Zirui Song, Miao Fang
To address this, we have introduced an additional optimistic Actor to enhance the model's exploration ability, while employing a more constrained pessimistic Actor for performance evaluation.