no code implementations • 8 Mar 2024 • Wei Zhou, Heike Adel, Hendrik Schuff, Ngoc Thang Vu
Attribution scores indicate the importance of different input parts and can, thus, explain model behaviour.
no code implementations • 13 Nov 2023 • Sheng Lu, Hendrik Schuff, Iryna Gurevych
In-context learning (ICL) has become one of the most popular learning paradigms.
1 code implementation • 13 Sep 2023 • Tilman Beck, Hendrik Schuff, Anne Lauscher, Iryna Gurevych
However, the available NLP literature disagrees on the efficacy of this technique: it remains unclear for which tasks and scenarios it can help, and the role of the individual factors in sociodemographic prompting is still unexplored.
1 code implementation • 4 May 2023 • Alon Jacovi, Hendrik Schuff, Heike Adel, Ngoc Thang Vu, Yoav Goldberg
Word-level saliency explanations ("heat maps over words") are often used to communicate feature-attribution in text-based models.
no code implementations • 13 Oct 2022 • Hendrik Schuff, Heike Adel, Peng Qi, Ngoc Thang Vu
This approach assumes that explanations which reach higher proxy scores will also provide a greater benefit to human users.
1 code implementation • 27 Jan 2022 • Hendrik Schuff, Alon Jacovi, Heike Adel, Yoav Goldberg, Ngoc Thang Vu
In this work, we focus on this question through a study of saliency-based explanations over textual data.
1 code implementation • EMNLP (BlackboxNLP) 2021 • Hendrik Schuff, Hsiu-Yu Yang, Heike Adel, Ngoc Thang Vu
For this, we investigate different sources of external knowledge and evaluate the performance of our models on in-domain data as well as on special transfer datasets that are designed to assess fine-grained reasoning capabilities.
no code implementations • 26 Jul 2021 • Hendrik Schuff, Heike Adel, Ngoc Thang Vu
In addition, we conduct a qualitative analysis of thought flow correction patterns and explore how thought flow predictions affect human users within a crowdsourcing study.
1 code implementation • EMNLP 2020 • Hendrik Schuff, Heike Adel, Ngoc Thang Vu
The user study shows that our models increase the ability of the users to judge the correctness of the system and that scores like F1 are not enough to estimate the usefulness of a model in a practical setting with human users.
no code implementations • WS 2017 • Hendrik Schuff, Jeremy Barnes, Julian Mohme, Sebastian Padó, Roman Klinger
There is a rich variety of data sets for sentiment analysis (viz., polarity and subjectivity classification).