Search Results for author: Renze Lou

Found 12 papers, 8 papers with code

Evaluating LLMs at Detecting Errors in LLM Responses

1 code implementation • 4 Apr 2024 • Ryo Kamoi, Sarkar Snigdha Sarathi Das, Renze Lou, Jihyun Janice Ahn, Yilun Zhao, Xiaoxin Lu, Nan Zhang, Yusen Zhang, Ranran Haoran Zhang, Sujeeth Reddy Vummanthala, Salika Dave, Shaobo Qin, Arman Cohan, Wenpeng Yin, Rui Zhang

This work introduces ReaLMistake, the first error detection benchmark consisting of objective, realistic, and diverse errors made by LLMs.

Instruction Following


TravelPlanner: A Benchmark for Real-World Planning with Language Agents

1 code implementation • 2 Feb 2024 • Jian Xie, Kai Zhang, Jiangjie Chen, Tinghui Zhu, Renze Lou, Yuandong Tian, Yanghua Xiao, Yu Su

Are these language agents capable of planning in more complex settings that are out of the reach of prior AI agents?


Large Language Models for Mathematical Reasoning: Progresses and Challenges

no code implementations • 31 Jan 2024 • Janice Ahn, Rishu Verma, Renze Lou, Di Liu, Rui Zhang, Wenpeng Yin

Mathematical reasoning serves as a cornerstone for assessing the fundamental cognitive capabilities of human intelligence.

Math, Mathematical Reasoning


UMIE: Unified Multimodal Information Extraction with Instruction Tuning

1 code implementation • 5 Jan 2024 • Lin Sun, Kai Zhang, Qingyuan Li, Renze Lou

Multimodal information extraction (MIE) gains significant attention as the popularity of multimedia content increases.


MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following

no code implementations • 5 Dec 2023 • Renze Lou, Kai Zhang, Jian Xie, Yuxuan Sun, Janice Ahn, Hanzi Xu, Yu Su, Wenpeng Yin

In the realm of large language models (LLMs), enhancing instruction-following capability often involves curating expansive training data.

Instruction Following


Toward Zero-Shot Instruction Following

1 code implementation • 4 Aug 2023 • Renze Lou, Wenpeng Yin

This work proposes a challenging yet more realistic setting for zero-shot cross-task generalization: zero-shot instruction following, presuming the existence of a paragraph-style task definition while no demonstrations exist.

Instruction Following


Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts

1 code implementation • 22 May 2023 • Jian Xie, Kai Zhang, Jiangjie Chen, Renze Lou, Yu Su

By providing external information to large language models (LLMs), tool augmentation (including retrieval augmentation) has emerged as a promising solution for addressing the limitations of LLMs' static parametric memory.

Retrieval


Large Language Model Instruction Following: A Survey of Progresses and Challenges

1 code implementation • 18 Mar 2023 • Renze Lou, Kai Zhang, Wenpeng Yin

This survey paper tries to summarize and provide insights to the current research on instruction following, particularly, by answering the following questions: (i) What is task instruction, and what instruction types exist?

Instruction Following


PAGE: A Position-Aware Graph-Based Model for Emotion Cause Entailment in Conversation

1 code implementation • 3 Mar 2023 • Xiaojie Gu, Renze Lou, Lin Sun, Shangxin Li

Conversational Causal Emotion Entailment (C2E2) is a task that aims at recognizing the causes corresponding to a target emotion in a conversation.

Causal Emotion Entailment, Causal Inference, +1


MORE: A Metric Learning Based Framework for Open-domain Relation Extraction

1 code implementation • 1 Jun 2022 • Yutong Wang, Renze Lou, Kai Zhang, MaoYan Chen, Yujiu Yang

To address these problems, in this work, we propose a novel learning framework named MORE (Metric learning-based Open Relation Extraction).

Clustering, Metric Learning, +2


GradTS: A Gradient-Based Automatic Auxiliary Task Selection Method Based on Transformer Networks

no code implementations • EMNLP 2021 • Weicheng Ma, Renze Lou, Kai Zhang, Lili Wang, Soroush Vosoughi

Compared to AUTOSEM, a strong baseline method, GradTS improves the performance of MT-DNN with a bert-base-cased backend model, from 0.33% to 17.93% on 8 natural language understanding (NLU) tasks in the GLUE benchmarks.

Multi-Task Learning, Natural Language Understanding


Contributions of Transformer Attention Heads in Multi- and Cross-lingual Tasks

no code implementations • ACL 2021 • Weicheng Ma, Kai Zhang, Renze Lou, Lili Wang, Soroush Vosoughi

Through extensive experiments, we show that (1) pruning a number of attention heads in a multi-lingual Transformer-based model has, in general, positive effects on its performance in cross-lingual and multi-lingual tasks and (2) the attention heads to be pruned can be ranked using gradients and identified with a few trial experiments.

XLM-R

