Search Results for author: Hainiu Xu

Found 8 papers, 4 papers with code

RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors

no code implementations • 13 May 2024 • Liam Dugan, Alyssa Hwang, Filip Trhlik, Josh Magnus Ludan, Andrew Zhu, Hainiu Xu, Daphne Ippolito, Chris Callison-Burch

However, very few of these detectors are evaluated on shared benchmark datasets and even when they are, the datasets used for evaluation are insufficiently challenging -- lacking variations in sampling strategy, adversarial attacks, and open-source generative models.

Adversarial Robustness, Text Detection

Towards Unified Task Embeddings Across Multiple Models: Bridging the Gap for Prompt-Based Large Language Models and Beyond

no code implementations • 22 Feb 2024 • Xinyu Wang, Hainiu Xu, Lin Gui, Yulan He

Task embedding, a meta-learning technique that captures task-specific information, has become prevalent, especially in areas such as multi-task learning, model editing, and interpretability.

Meta-Learning, Model Editing

Large Language Models Fall Short: Understanding Complex Relationships in Detective Narratives

no code implementations • 16 Feb 2024 • Runcong Zhao, Qinglin Zhu, Hainiu Xu, Jiazheng Li, Yuxiang Zhou, Yulan He, Lin Gui

Existing datasets for narrative understanding often fail to represent the complexity and uncertainty of relationships in real-life social scenarios.

OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models

1 code implementation • 8 Feb 2024 • Hainiu Xu, Runcong Zhao, Lixing Zhu, Jinhua Du, Yulan He

Neural Theory-of-Mind (N-ToM), a machine's ability to understand and keep track of the mental states of others, is pivotal in developing socially intelligent agents.

OpenPI2.0: An Improved Dataset for Entity Tracking in Texts

1 code implementation • 24 May 2023 • Li Zhang, Hainiu Xu, Abhinav Kommula, Chris Callison-Burch, Niket Tandon

An earlier dataset, OpenPI, provided crowdsourced annotations of entity state changes in text.

Question Answering

Exploring the Curious Case of Code Prompts

1 code implementation • 26 Apr 2023 • Li Zhang, Liam Dugan, Hainiu Xu, Chris Callison-Burch

Furthermore, we show that the style of code prompt has a large effect on performance for some but not all tasks and that fine-tuning on text instructions leads to better relative performance of code prompts.

Human-in-the-Loop Schema Induction

no code implementations • 25 Feb 2023 • Tianyi Zhang, Isaac Tham, Zhaoyi Hou, Jiaxuan Ren, Liyang Zhou, Hainiu Xu, Li Zhang, Lara J. Martin, Rotem Dror, Sha Li, Heng Ji, Martha Palmer, Susan Brown, Reece Suchocki, Chris Callison-Burch

Schema induction builds a graph representation explaining how events unfold in a scenario.

Information Retrieval, Retrieval

Causal Reasoning of Entities and Events in Procedural Texts

1 code implementation • 26 Jan 2023 • Li Zhang, Hainiu Xu, Yue Yang, Shuyan Zhou, Weiqiu You, Manni Arora, Chris Callison-Burch

By injecting the causal relations between entities and events as intermediate reasoning steps in our representation, we further boost the performance to .67 F1.
