1 code implementation • 23 May 2024 • Yanrui Du, Sendong Zhao, Danyang Zhao, Ming Ma, Yuhan Chen, Liangyu Huo, Qing Yang, Dongliang Xu, Bing Qin
When encountering malicious instructions, the router will assign a higher weight to the safe LLM to ensure that responses are harmless.
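As a rough illustration of the routing idea in this excerpt, the sketch below blends the next-token distributions of two models using a router weight. It is not the authors' implementation: the function names, the second ("general") model, the scalar `p_malicious` score, and the linear mixing rule are all assumptions made for the example.

```python
def route(p_malicious):
    """Map a hypothetical maliciousness score in [0, 1] to (w_safe, w_general).

    Assumption: the router emits one scalar, and the weight on the safe
    model grows with it. The listing does not specify the actual rule.
    """
    if not 0.0 <= p_malicious <= 1.0:
        raise ValueError("p_malicious must lie in [0, 1]")
    return p_malicious, 1.0 - p_malicious


def mix_distributions(safe_probs, general_probs, p_malicious):
    """Blend two next-token distributions with the router's weights."""
    if len(safe_probs) != len(general_probs):
        raise ValueError("distributions must share one vocabulary")
    w_safe, w_general = route(p_malicious)
    return [w_safe * s + w_general * g
            for s, g in zip(safe_probs, general_probs)]


# Toy two-token vocabulary: index 0 = refusal token, index 1 = compliance token.
safe_probs = [0.9, 0.1]
general_probs = [0.2, 0.8]
mixed = mix_distributions(safe_probs, general_probs, p_malicious=0.75)
```

With a high `p_malicious`, the mixed distribution leans toward the safe model's output; with a low score, it leans toward the general model's.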
1 code implementation • 8 Sep 2023 • Yanrui Du, Sendong Zhao, MuZhen Cai, Ming Ma, Danyang Zhao, Jiawei Cao, Bing Qin
We conduct several experiments to analyze the dual logic ability of LLMs by examining the consistency of the stance in responses to paired questions about the same fact.
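A minimal sketch of how stance consistency over paired questions might be scored. It assumes each response has already been reduced to a stance label on the underlying fact (the labels `"support"` and `"refute"` and the function name are hypothetical); the paper's actual protocol may differ.

```python
def stance_consistency(pairs):
    """Return the fraction of question pairs whose two stances agree.

    `pairs` is a list of (stance_a, stance_b) tuples, where each stance is
    the model's position on the same underlying fact, normalized beforehand.
    """
    if not pairs:
        raise ValueError("need at least one pair")
    consistent = sum(1 for a, b in pairs if a == b)
    return consistent / len(pairs)


# Hypothetical stances for three facts, each probed with two paired questions.
pairs = [
    ("support", "support"),  # consistent
    ("support", "refute"),   # inconsistent: the model contradicts itself
    ("refute", "refute"),    # consistent
]
rate = stance_consistency(pairs)
```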