Search Results for author: Alan Ritter

Found 61 papers, 31 papers with code

NEO-BENCH: Evaluating Robustness of Large Language Models with Neologisms

no code implementations • 19 Feb 2024 • Jonathan Zheng, Alan Ritter, Wei Xu

The performance of Large Language Models (LLMs) degrades from the temporal drift between data used for model training and newer text seen during inference.

Machine Translation Natural Language Understanding +1

Stanceosaurus 2.0: Classifying Stance Towards Russian and Spanish Misinformation

no code implementations • 6 Feb 2024 • Anton Lavrouk, Ian Ligon, Tarek Naous, Jonathan Zheng, Alan Ritter, Wei Xu

The Stanceosaurus corpus (Zheng et al., 2022) was designed to provide high-quality, annotated, 5-way stance data extracted from Twitter, suitable for analyzing cross-cultural and cross-lingual misinformation.

Misinformation Stance Classification +1

Constrained Decoding for Cross-lingual Label Projection

1 code implementation • 5 Feb 2024 • Duong Minh Le, Yang Chen, Alan Ritter, Wei Xu

Therefore, it is common to exploit translation and label projection to further improve the performance by (1) translating training data that is available in a high-resource language (e.g., English) together with the gold labels into low-resource languages, and/or (2) translating test data in low-resource languages to a high-resource language to run inference on, then projecting the predicted span-level labels back onto the original test data.

Event Argument Extraction named-entity-recognition +5

UniIR: Training and Benchmarking Universal Multimodal Information Retrievers

no code implementations • 28 Nov 2023 • Cong Wei, Yang Chen, Haonan Chen, Hexiang Hu, Ge Zhang, Jie Fu, Alan Ritter, Wenhu Chen

Existing information retrieval (IR) models often assume a homogeneous format, limiting their applicability to diverse user needs, such as searching for images with text descriptions, searching for a news article with a headline image, or finding a similar photo with a query image.

Benchmarking Information Retrieval +2

Reducing Privacy Risks in Online Self-Disclosures with Language Models

no code implementations • 16 Nov 2023 • Yao Dou, Isadora Krsek, Tarek Naous, Anubha Kabra, Sauvik Das, Alan Ritter, Wei Xu

Motivated by the user feedback, we introduce the task of self-disclosure abstraction, which is paraphrasing disclosures into less specific terms while preserving their utility, e.g., "Im 16F" to "I'm a teenage girl".

Language Modelling

Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game

no code implementations • 2 Nov 2023 • Sam Toyer, Olivia Watkins, Ethan Adrian Mendes, Justin Svegliato, Luke Bailey, Tiffany Wang, Isaac Ong, Karim Elmaaroufi, Pieter Abbeel, Trevor Darrell, Alan Ritter, Stuart Russell

Our benchmark results show that many models are vulnerable to the attack strategies in the Tensor Trust dataset.

Instruction Following

Can Language Models be Instructed to Protect Personal Information?

no code implementations • 3 Oct 2023 • Yang Chen, Ethan Mendes, Sauvik Das, Wei Xu, Alan Ritter

While data leaks should be prevented, it is also crucial to examine the trade-off between the privacy protection and model utility of proposed approaches.

Adversarial Robustness

Self-Specialization: Uncovering Latent Expertise within Large Language Models

no code implementations • 29 Sep 2023 • Junmo Kang, Hongyin Luo, Yada Zhu, James Glass, David Cox, Alan Ritter, Rogerio Feris, Leonid Karlinsky

Recent works have demonstrated the effectiveness of self-alignment in which a large language model is, by itself, aligned to follow general instructions through the automatic generation of instructional data using a handful of human-written seeds.

Hallucination Instruction Following +2

Improved Instruction Ordering in Recipe-Grounded Conversation

1 code implementation • 26 May 2023 • Duong Minh Le, Ruohao Guo, Wei Xu, Alan Ritter

In this paper, we study the task of instructional dialogue and focus on the cooking domain.

Intent Detection Response Generation

Instruction Tuning with Lexicons for Zero-Shot Style Classification

no code implementations • 24 May 2023 • Ruohao Guo, Wei Xu, Alan Ritter

Style is used to convey authors' intentions and attitudes.

Classification

Having Beer after Prayer? Measuring Cultural Bias in Large Language Models

no code implementations • 23 May 2023 • Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu

In this paper, we show that multilingual and Arabic monolingual LMs exhibit bias towards entities associated with Western culture.

named-entity-recognition Named Entity Recognition +4

Schema-Driven Information Extraction from Heterogeneous Tables

1 code implementation • 23 May 2023 • Fan Bai, Junmo Kang, Gabriel Stanovsky, Dayne Freitag, Alan Ritter

We use this collection of annotated tables to evaluate the ability of open-source and API-based language models to extract information from tables covering diverse domains and data formats.

Ranked #1 on Attribute Extraction on SWDE

Attribute Extraction Instruction Following +1

Better Low-Resource Entity Recognition Through Translation and Annotation Fusion

1 code implementation • 23 May 2023 • Yang Chen, Vedaant Shah, Alan Ritter

Pre-trained multilingual language models have enabled significant advancements in cross-lingual transfer.

Cross-Lingual Transfer Low Resource Named Entity Recognition +4

Are Large Language Models Robust Coreference Resolvers?

1 code implementation • 23 May 2023 • Nghia T. Le, Alan Ritter

Recent work on extending coreference resolution across domains and languages relies on annotated data in both the target domain and language.

coreference-resolution Domain Adaptation +3

Distill or Annotate? Cost-Efficient Fine-Tuning of Compact Models

no code implementations • 2 May 2023 • Junmo Kang, Wei Xu, Alan Ritter

Fine-tuning large models is highly effective; however, inference can be expensive and produces carbon emissions.

Knowledge Distillation

Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions?

2 code implementations • 23 Feb 2023 • Yang Chen, Hexiang Hu, Yi Luan, Haitian Sun, Soravit Changpinyo, Alan Ritter, Ming-Wei Chang

In this study, we introduce InfoSeek, a visual question answering dataset tailored for information-seeking questions that cannot be answered with only common sense knowledge.

Ranked #3 on Visual Question Answering (VQA) on InfoSeek

Open-Domain Question Answering Visual Question Answering

Human-in-the-loop Evaluation for Early Misinformation Detection: A Case Study of COVID-19 Treatments

1 code implementation • 19 Dec 2022 • Ethan Mendes, Yang Chen, Wei Xu, Alan Ritter

We present a human-in-the-loop evaluation framework for fact-checking novel misinformation claims and identifying social media messages that support them.

Fact Checking Misinformation

Do CoNLL-2003 Named Entity Taggers Still Work Well in 2023?

1 code implementation • 19 Dec 2022 • Shuheng Liu, Alan Ritter

In this paper, we evaluate the generalization of over 20 different models trained on CoNLL-2003, and show that NER models have very different generalization.

named-entity-recognition Named Entity Recognition +2

Frustratingly Easy Label Projection for Cross-lingual Transfer

1 code implementation • 28 Nov 2022 • Yang Chen, Chao Jiang, Alan Ritter, Wei Xu

Translating training data into many languages has emerged as a practical solution for improving cross-lingual transfer.

Ranked #1 on Cross-Lingual NER on MasakhaNER2.0

Cross-Lingual NER NER +4

Stanceosaurus: Classifying Stance Towards Multilingual Misinformation

no code implementations • 28 Oct 2022 • Jonathan Zheng, Ashutosh Baheti, Tarek Naous, Wei Xu, Alan Ritter

We present Stanceosaurus, a new corpus of 28,033 tweets in English, Hindi, and Arabic annotated with stance towards 251 misinformation claims.

Domain Adaptation Fact Checking +1

Few-Shot Anaphora Resolution in Scientific Protocols via Mixtures of In-Context Experts

1 code implementation • 7 Oct 2022 • Nghia T. Le, Fan Bai, Alan Ritter

As far as we are aware, this is the first work to present experimental results demonstrating the effectiveness of in-context learning on the task of few-shot anaphora resolution in scientific protocols.

In-Context Learning Language Modelling +1

SynKB: Semantic Search for Synthetic Procedures

1 code implementation • 15 Aug 2022 • Fan Bai, Alan Ritter, Peter Madrid, Dayne Freitag, John Niekrasz

In this paper we present SynKB, an open-source, automatically extracted knowledge base of chemical synthesis protocols.

Pre-train or Annotate? Domain Adaptation with a Constrained Budget

1 code implementation • EMNLP 2021 • Fan Bai, Alan Ritter, Wei Xu

Our experiments suggest task-specific data annotation should be part of an economical strategy when adapting an NLP model to a new domain.

Domain Adaptation

Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts

1 code implementation • EMNLP 2021 • Ashutosh Baheti, Maarten Sap, Alan Ritter, Mark Riedl

To better understand the dynamics of contextually offensive language, we investigate the stance of dialogue model responses in offensive Reddit conversations.

Dialogue Generation

Process-Level Representation of Scientific Protocols with Interactive Annotation

2 code implementations • EACL 2021 • Ronen Tamari, Fan Bai, Alan Ritter, Gabriel Stanovsky

We develop Process Execution Graphs (PEG), a document-level representation of real-world wet lab biochemistry protocols, addressing challenges such as cross-sentence relations, long-range coreference, grounding, and implicit arguments.

Relation Extraction Sentence

WNUT-2020 Task 1 Overview: Extracting Entities and Relations from Wet Lab Protocols

1 code implementation • EMNLP (WNUT) 2020 • Jeniya Tabassum, Sydney Lee, Wei Xu, Alan Ritter

This paper presents the results of the wet lab information extraction task at WNUT 2020.

Ranked #1 on Relation Extraction on WNUT 2020

named-entity-recognition Named Entity Recognition +3

Model Selection for Cross-Lingual Transfer

1 code implementation • EMNLP 2021 • Yang Chen, Alan Ritter

Transformers that are pre-trained on multilingual corpora, such as mBERT and XLM-RoBERTa, have achieved impressive cross-lingual transfer capabilities.

Meta-Learning Model Selection +2

Model Selection for Cross-Lingual Transfer using a Learned Scoring Function

no code implementations • 28 Sep 2020 • Yang Chen, Alan Ritter

In the zero-shot cross-lingual transfer setting, only English training data is assumed, and the fine-tuned model is evaluated on another target language.

Model Selection Transfer Learning +1

Measuring Forecasting Skill from Text

1 code implementation • ACL 2020 • Shi Zong, Alan Ritter, Eduard Hovy

We present a number of linguistic metrics which are computed over text associated with people's predictions about the future including: uncertainty, readability, and emotion.

Extracting a Knowledge Base of COVID-19 Events from Social Media

2 code implementations • COLING 2022 • Shi Zong, Ashutosh Baheti, Wei Xu, Alan Ritter

In this paper, we present a manually annotated corpus of 10,000 tweets containing public reports of five COVID-19 events, including positive and negative tests, deaths, denied access to testing, claimed cures and preventions.

Extracting COVID-19 Events from Twitter slot-filling

Fluent Response Generation for Conversational Question Answering

1 code implementation • ACL 2020 • Ashutosh Baheti, Alan Ritter, Kevin Small

In this work, we propose a method for situating QA responses within a SEQ2SEQ NLG approach to generate fluent grammatical answer responses while maintaining correctness.

Conversational Question Answering Data Augmentation +3

Code and Named Entity Recognition in StackOverflow

2 code implementations • ACL 2020 • Jeniya Tabassum, Mounica Maddela, Wei Xu, Alan Ritter

We also present the SoftNER model which achieves an overall 79.10 F1 score for code and named entity recognition on StackOverflow data.

named-entity-recognition Named Entity Recognition +1

An Empirical Study of Pre-trained Transformers for Arabic Information Extraction

1 code implementation • EMNLP 2020 • Wuwei Lan, Yang Chen, Wei Xu, Alan Ritter

Multilingual pre-trained Transformers, such as mBERT (Devlin et al., 2019) and XLM-RoBERTa (Conneau et al., 2020a), have been shown to enable effective cross-lingual zero-shot transfer.

Cross-Lingual Transfer Language Modelling +10

SemEval-2013 Task 2: Sentiment Analysis in Twitter

no code implementations • SEMEVAL 2013 • Preslav Nakov, Zornitsa Kozareva, Alan Ritter, Sara Rosenthal, Veselin Stoyanov, Theresa Wilson

To address this issue, we have proposed SemEval-2013 Task 2: Sentiment Analysis in Twitter, which included two subtasks: A, an expression-level subtask, and B, a message-level subtask.

Sentiment Analysis Task 2

SemEval-2014 Task 9: Sentiment Analysis in Twitter

no code implementations • SEMEVAL 2014 • Sara Rosenthal, Preslav Nakov, Alan Ritter, Veselin Stoyanov

We describe the Sentiment Analysis in Twitter task, run as part of SemEval-2014.

Sentiment Analysis

SemEval-2015 Task 10: Sentiment Analysis in Twitter

no code implementations • SEMEVAL 2015 • Sara Rosenthal, Saif M. Mohammad, Preslav Nakov, Alan Ritter, Svetlana Kiritchenko, Veselin Stoyanov

In this paper, we describe the 2015 iteration of the SemEval shared task on Sentiment Analysis in Twitter.

Sentiment Analysis

SemEval-2016 Task 4: Sentiment Analysis in Twitter

no code implementations • SEMEVAL 2016 • Preslav Nakov, Alan Ritter, Sara Rosenthal, Fabrizio Sebastiani, Veselin Stoyanov

The three new subtasks focus on two variants of the basic "sentiment classification in Twitter" task.

General Classification Sentiment Analysis +1

Structured Minimally Supervised Learning for Neural Relation Extraction

1 code implementation • NAACL 2019 • Fan Bai, Alan Ritter

Our approach achieves state-of-the-art results on minimally supervised sentential relation extraction, outperforming a number of baselines, including a competitive approach that uses the attention layer of a purely neural model.

Relation Relation Extraction +1

Analyzing the Perceived Severity of Cybersecurity Threats Reported on Social Media

1 code implementation • NAACL 2019 • Shi Zong, Alan Ritter, Graham Mueller, Evan Wright

In this paper, we investigate methods to analyze the severity of cybersecurity threats based on the language that is used to describe them online.

Generating More Interesting Responses in Neural Conversation Models with Distributional Constraints

2 code implementations • EMNLP 2018 • Ashutosh Baheti, Alan Ritter, Jiwei Li, Bill Dolan

Neural conversation models tend to generate safe, generic responses for most inputs.

An Annotated Corpus for Machine Reading of Instructions in Wet Lab Protocols

no code implementations • NAACL 2018 • Chaitanya Kulkarni, Wei Xu, Alan Ritter, Raghu Machiraju

We make our annotated Wet Lab Protocol Corpus available to the research community.

BIG-bench Machine Learning Reading Comprehension +1

"i have a feeling trump will win..................": Forecasting Winners and Losers from User Predictions on Twitter

no code implementations • EMNLP 2017 • Sandesh Swamy, Alan Ritter, Marie-Catherine de Marneffe

Social media users often make explicit predictions about upcoming events.

Sentiment Analysis

"i have a feeling trump will win..................": Forecasting Winners and Losers from User Predictions on Twitter

1 code implementation • EMNLP 2017 • Sandesh Swamy, Alan Ritter, Marie-Catherine de Marneffe

Social media users often make explicit predictions about upcoming events.

Adversarial Learning for Neural Dialogue Generation

8 code implementations • EMNLP 2017 • Jiwei Li, Will Monroe, Tianlin Shi, Sébastien Jean, Alan Ritter, Dan Jurafsky

In this paper, drawing intuition from the Turing test, we propose using adversarial training for open-domain dialogue generation: the system is trained to produce sequences that are indistinguishable from human-generated dialogue utterances.

Ranked #1 on Dialogue Generation on Amazon-5

Dialogue Evaluation Dialogue Generation +1

Results of the WNUT16 Named Entity Recognition Shared Task

no code implementations • WS 2016 • Benjamin Strauss, Bethany Toma, Alan Ritter, Marie-Catherine de Marneffe, Wei Xu

This paper presents the results of the Twitter Named Entity Recognition shared task associated with W-NUT 2016: a named entity tagging task with 10 teams participating.

Named Entity Recognition Named Entity Recognition (NER)

TweeTime: A Minimally Supervised Method for Recognizing and Normalizing Time Expressions in Twitter

1 code implementation • EMNLP 2016 • Jeniya Tabassum, Alan Ritter, Wei Xu

Information Retrieval Knowledge Base Population

TweeTime: A Minimally Supervised Method for Recognizing and Normalizing Time Expressions in Twitter

1 code implementation • 9 Aug 2016 • Jeniya Tabassum, Alan Ritter, Wei Xu

We describe TweeTIME, a temporal tagger for recognizing and normalizing time expressions in Twitter.

Deep Reinforcement Learning for Dialogue Generation

8 code implementations • EMNLP 2016 • Jiwei Li, Will Monroe, Alan Ritter, Michel Galley, Jianfeng Gao, Dan Jurafsky

Recent neural models of dialogue generation offer great promise for generating responses for conversational agents, but tend to be shortsighted, predicting utterances one at a time while ignoring their influence on future outcomes.

Dialogue Generation Policy Gradient Methods +2

Learning multi-faceted representations of individuals from heterogeneous evidence using neural networks

no code implementations • 18 Oct 2015 • Jiwei Li, Alan Ritter, Dan Jurafsky

Inferring latent attributes of people online is an important social computing task, but requires integrating the many heterogeneous sources of information available on the web.

Community Detection Link Prediction

Shared Tasks of the 2015 Workshop on Noisy User-generated Text: Twitter Lexical Normalization and Named Entity Recognition

no code implementations • WS 2015 • Timothy Baldwin, Marie Catherine de Marneffe, Bo Han, Young-Bum Kim, Alan Ritter, Wei Xu

Lexical Normalization named-entity-recognition +2

Sense Discovery via Co-Clustering on Images and Text

no code implementations • CVPR 2015 • Xinlei Chen, Alan Ritter, Abhinav Gupta, Tom Mitchell

We present a co-clustering framework that can be used to discover multiple semantic and visual senses of a given Noun Phrase (NP).

Clustering

Inferring User Preferences by Probabilistic Logical Reasoning over Social Networks

no code implementations • 11 Nov 2014 • Jiwei Li, Alan Ritter, Dan Jurafsky

by building a probabilistic model that reasons over user attributes (the user's location or gender) and the social network (the user's friends and spouse), via inferences like homophily (I am more likely to like sushi if spouse or friends like sushi, I am more likely to like the Knicks if I live in New York).

Attribute Logical Reasoning +1