no code implementations • 13 Mar 2024 • Ang Li, Qiugen Xiao, Peng Cao, Jian Tang, Yi Yuan, Zijie Zhao, Xiaoyuan Chen, Liang Zhang, Xiangyang Li, Kaitong Yang, Weidong Guo, Yukang Gan, Xu Yu, Daniell Wang, Ying Shan
Using ChatGPT as a labeler to provide feedback on open-domain prompts in RLAIF training, we observe an increase in human evaluators' preference win ratio for model responses, but a decrease in evaluators' satisfaction rate.
no code implementations • 31 Dec 2023 • Chenyuan Yang, Zijie Zhao, Lingming Zhang
Bugs in operating system kernels can affect billions of devices and users all over the world.