no code implementations • NAACL (ALVR) 2021 • Julia Suter, Letitia Parcalabescu, Anette Frank
Phrase grounding (PG) is a multimodal task that grounds language in images.
1 code implementation • 29 Apr 2024 • Letitia Parcalabescu, Anette Frank
We also evaluate the self-consistency of VLM decoders in both post-hoc and CoT explanation settings, by extending existing tests and measures to VLM decoders.
2 code implementations • 13 Nov 2023 • Letitia Parcalabescu, Anette Frank
In this work we argue that these faithfulness tests do not measure faithfulness to the models' inner workings -- but rather their self-consistency at the output level.
no code implementations • 13 Nov 2023 • Ilker Kesen, Andrea Pedrotti, Mustafa Dogan, Michele Cafagna, Emre Can Acikgoz, Letitia Parcalabescu, Iacer Calixto, Anette Frank, Albert Gatt, Aykut Erdem, Erkut Erdem
With the ever-increasing popularity of pretrained Video-Language Models (VidLMs), there is a pressing need to develop robust evaluation methodologies that delve deeper into their visio-linguistic capabilities.
1 code implementation • 15 Dec 2022 • Letitia Parcalabescu, Anette Frank
We apply MM-SHAP in two ways: (1) to compare models for their average degree of multimodality, and (2) to measure for individual models the contribution of individual modalities for different tasks and datasets.
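As a rough illustration of the kind of score MM-SHAP reports, the sketch below computes the share of total absolute Shapley attribution that falls on each modality's input tokens. This is a minimal assumption-laden sketch, not the paper's implementation: it presumes per-token Shapley values have already been computed (e.g., with a Shapley-value explainer) and that tokens are labeled by modality; the function name `mm_shap` and its arguments are hypothetical.

```python
import numpy as np

def mm_shap(shap_values, is_text_token):
    """Hypothetical sketch: performance-agnostic multimodality score as the
    share of total absolute Shapley attribution per modality.

    shap_values   -- per-token Shapley values for one prediction (assumed given)
    is_text_token -- boolean mask, True for text tokens, False for image tokens
    """
    abs_vals = np.abs(np.asarray(shap_values, dtype=float))
    mask = np.asarray(is_text_token, dtype=bool)
    total = abs_vals.sum()
    t_shap = abs_vals[mask].sum() / total  # textual degree
    v_shap = 1.0 - t_shap                  # visual degree
    return t_shap, v_shap

# Equal attribution mass on text and image tokens yields (0.5, 0.5),
# i.e., a perfectly balanced multimodal prediction under this sketch.
print(mm_shap([1.0, -1.0, 2.0], [True, False, False]))
```

Averaging such scores over a dataset would give a model-level degree of multimodality, which is the first of the two uses described above.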
1 code implementation • ACL 2022 • Letitia Parcalabescu, Michele Cafagna, Lilitta Muradjan, Anette Frank, Iacer Calixto, Albert Gatt
We propose VALSE (Vision And Language Structured Evaluation), a novel benchmark designed for testing general-purpose pretrained vision and language (V&L) models for their visio-linguistic grounding capabilities on specific linguistic phenomena.
1 code implementation • 9 Dec 2021 • Constantin Eichenberg, Sidney Black, Samuel Weinbach, Letitia Parcalabescu, Anette Frank
Large-scale pretraining is fast becoming the norm in Vision-Language (VL) modeling.
no code implementations • ACL (mmsr, IWCS) 2021 • Letitia Parcalabescu, Nils Trost, Anette Frank
Recent years have seen rapid developments in the field of multimodal machine learning, combining e.g. vision, text, or speech.
no code implementations • ACL (mmsr, IWCS) 2021 • Letitia Parcalabescu, Albert Gatt, Anette Frank, Iacer Calixto
We investigate the reasoning ability of pretrained vision and language (V&L) models in two tasks that require multimodal integration: (1) discriminating a correct image-sentence pair from an incorrect one, and (2) counting entities in an image.
3 code implementations • 29 Jan 2020 • Juri Opitz, Letitia Parcalabescu, Anette Frank
Different metrics have been proposed to compare Abstract Meaning Representation (AMR) graphs.
Ranked #4 on Graph Matching on RARE