1 code implementation • 4 Apr 2024 • Zhangdie Yuan, Chenxi Whitehouse, Eric Chamoun, Rami Aly, Andreas Vlachos
This paper introduces PRobELM (Plausibility Ranking Evaluation for Language Models), a benchmark designed to assess language models' ability to discern more plausible from less plausible scenarios through their parametric knowledge.
1 code implementation • 21 Jan 2024 • Yuan He, Zhangdie Yuan, Jiaoyan Chen, Ian Horrocks
A key limitation of current language models (LMs) is their difficulty in interpreting hierarchical structures latent in language.
2 code implementations • 4 Jan 2024 • Songbo Hu, Xiaobin Wang, Zhangdie Yuan, Anna Korhonen, Ivan Vulić
We present DIALIGHT, a toolkit for developing and evaluating multilingual Task-Oriented Dialogue (ToD) systems. It facilitates systematic evaluations and comparisons between ToD systems built by fine-tuning Pretrained Language Models (PLMs) and those utilising the zero-shot and in-context learning capabilities of Large Language Models (LLMs).
no code implementations • 19 Dec 2023 • Zhangdie Yuan, Andreas Vlachos
Despite progress in automated fact-checking, most systems require a significant amount of labeled training data, which is expensive.
1 code implementation • 22 Oct 2022 • Nedjma Ousidhoum, Zhangdie Yuan, Andreas Vlachos
Our method outperforms previous work on a fact-checking question generation dataset across a wide range of automatic evaluation metrics.
1 code implementation • 12 Oct 2022 • Zhangdie Yuan, Songbo Hu, Ivan Vulić, Anna Korhonen, Zaiqiao Meng
Acquiring factual knowledge with Pretrained Language Models (PLMs) has attracted increasing attention, with PLMs showing promising performance on many knowledge-intensive tasks.