1 code implementation • 3 Jun 2024 • Zhenhua Liu, Tong Zhu, Chuanyuan Tan, Haonan Lu, Bing Liu, Wenliang Chen
Large Language Models (LLMs) have shown impressive capabilities, but they also raise concerns about data contamination arising from privacy issues and the leakage of benchmark datasets during pre-training.
1 code implementation • 14 May 2024 • Mengsong Wu, Tong Zhu, Han Han, Chuanyuan Tan, Xiang Zhang, Wenliang Chen
Therefore, Seal-Tools can serve as a new benchmark to evaluate the tool-calling ability of LLMs.
no code implementations • 23 May 2023 • Chuanyuan Tan, Yuehe Chen, Wenbiao Shao, Wenliang Chen
Question answering over knowledge bases (KBQA) aims to answer factoid questions with a given knowledge base (KB).