1 code implementation • 4 Apr 2024 • Ryo Kamoi, Sarkar Snigdha Sarathi Das, Renze Lou, Jihyun Janice Ahn, Yilun Zhao, Xiaoxin Lu, Nan Zhang, Yusen Zhang, Ranran Haoran Zhang, Sujeeth Reddy Vummanthala, Salika Dave, Shaobo Qin, Arman Cohan, Wenpeng Yin, Rui Zhang
This work introduces ReaLMistake, the first error detection benchmark consisting of objective, realistic, and diverse errors made by LLMs.
1 code implementation • 2 Feb 2024 • Jian Xie, Kai Zhang, Jiangjie Chen, Tinghui Zhu, Renze Lou, Yuandong Tian, Yanghua Xiao, Yu Su
Are these language agents capable of planning in more complex settings that are out of the reach of prior AI agents?
no code implementations • 31 Jan 2024 • Janice Ahn, Rishu Verma, Renze Lou, Di Liu, Rui Zhang, Wenpeng Yin
Mathematical reasoning serves as a cornerstone for assessing the fundamental cognitive capabilities of human intelligence.
1 code implementation • 5 Jan 2024 • Lin Sun, Kai Zhang, Qingyuan Li, Renze Lou
Multimodal information extraction (MIE) gains significant attention as the popularity of multimedia content increases.
no code implementations • 5 Dec 2023 • Renze Lou, Kai Zhang, Jian Xie, Yuxuan Sun, Janice Ahn, Hanzi Xu, Yu Su, Wenpeng Yin
In the realm of large language models (LLMs), enhancing instruction-following capability often involves curating expansive training data.
1 code implementation • 4 Aug 2023 • Renze Lou, Wenpeng Yin
This work proposes a challenging yet more realistic setting for zero-shot cross-task generalization: zero-shot instruction following, presuming the existence of a paragraph-style task definition while no demonstrations exist.
1 code implementation • 22 May 2023 • Jian Xie, Kai Zhang, Jiangjie Chen, Renze Lou, Yu Su
By providing external information to large language models (LLMs), tool augmentation (including retrieval augmentation) has emerged as a promising solution for addressing the limitations of LLMs' static parametric memory.
1 code implementation • 18 Mar 2023 • Renze Lou, Kai Zhang, Wenpeng Yin
This survey paper tries to summarize and provide insights into the current research on instruction following, particularly, by answering the following questions: (i) What is task instruction, and what instruction types exist?
1 code implementation • 3 Mar 2023 • Xiaojie Gu, Renze Lou, Lin Sun, Shangxin Li
Conversational Causal Emotion Entailment (C2E2) is a task that aims at recognizing the causes corresponding to a target emotion in a conversation.
1 code implementation • 1 Jun 2022 • Yutong Wang, Renze Lou, Kai Zhang, MaoYan Chen, Yujiu Yang
To address these problems, in this work, we propose a novel learning framework named MORE (Metric learning-based Open Relation Extraction).
no code implementations • EMNLP 2021 • Weicheng Ma, Renze Lou, Kai Zhang, Lili Wang, Soroush Vosoughi
Compared to AUTOSEM, a strong baseline method, GradTS improves the performance of MT-DNN with a bert-base-cased backend model, from 0.33% to 17.93% on 8 natural language understanding (NLU) tasks in the GLUE benchmarks.
no code implementations • ACL 2021 • Weicheng Ma, Kai Zhang, Renze Lou, Lili Wang, Soroush Vosoughi
Through extensive experiments, we show that (1) pruning a number of attention heads in a multi-lingual Transformer-based model has, in general, positive effects on its performance in cross-lingual and multi-lingual tasks and (2) the attention heads to be pruned can be ranked using gradients and identified with a few trial experiments.