no code implementations • 21 Feb 2021 • Yixuan Liu, Hu Wang, Xiaowei Wang, Xiaoyue Sun, Liuyue Jiang, Minhui Xue
Therefore, a purify-trained classifier is designed to obtain the distribution and generate the calibrated rewards.