no code implementations • COLING 2022 • Yitian Li, Jidong Tian, Wenqing Chen, Caoyun Fan, Hao He, Yaohui Jin
In this paper, we propose a systematic method to diagnose the correlations between an NLU dataset and a specific skill, and then analyze a fundamental reasoning skill, logical reasoning, as a case study.
no code implementations • 12 Dec 2023 • Caoyun Fan, Jidong Tian, Yitian Li, Hao He, Yaohui Jin
In-Context Learning (ICL) is an important paradigm for adapting Large Language Models (LLMs) to downstream tasks through a few demonstrations.
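As a concrete illustration of the paradigm, here is a minimal sketch of an ICL prompt for a sentiment task; the task, demonstrations, and label words are invented for illustration and are not taken from the paper:

```python
# Minimal sketch of In-Context Learning (ICL): a frozen LLM is adapted to a
# downstream task purely through a few demonstrations placed in the prompt.
demonstrations = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I regretted buying a ticket.", "negative"),
]

def build_icl_prompt(query: str) -> str:
    """Concatenate a few (input, label) demonstrations before the query."""
    lines = ["Classify the sentiment of each review."]
    for text, label in demonstrations:
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

print(build_icl_prompt("The plot dragged, but the acting was superb."))
```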
no code implementations • 9 Dec 2023 • Caoyun Fan, Jindou Chen, Yaohui Jin, Hao He
Given the close alignment between the behavior of Large Language Models (LLMs) and that of humans, a promising research direction is to employ LLMs as substitutes for human participants in game experiments, enabling social science research.
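To make the setup concrete, here is a hedged sketch of posing a classic one-shot ultimatum-game decision to an LLM; the prompt wording and model name are illustrative assumptions, not the paper's protocol:

```python
# Hedged sketch: using an LLM as a stand-in for a human participant in a
# behavioral-economics game experiment (a one-shot ultimatum game).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

prompt = (
    "You are a participant in an economics experiment. You have $100 and "
    "must propose a split with an anonymous partner. If the partner rejects "
    "your offer, both of you receive nothing. How much do you offer, and "
    "why? Answer with a dollar amount and a brief reason."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # hypothetical choice; any chat model would do
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```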
no code implementations • 18 Oct 2023 • Caoyun Fan, Jidong Tian, Yitian Li, Wenqing Chen, Hao He, Yaohui Jin
From the perspective of CoT, CoTT's two-step framework enables masked language models (MLMs) to perform task decomposition, and CoTT's prompt tuning allows intermediate steps to be used in natural language form.
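The two-step pattern can be sketched for a fill-mask MLM as follows; the templates, label words, and the use of roberta-base are assumptions for illustration, not CoTT's actual prompts:

```python
# Hedged sketch of a two-step, CoT-style decomposition for a masked language
# model: step 1 elicits an intermediate step in natural language, step 2
# conditions the final prediction on it.
from transformers import pipeline

fill = pipeline("fill-mask", model="roberta-base")

def two_step_predict(sentence: str) -> str:
    # Step 1: elicit an intermediate step (task decomposition).
    step1 = f"{sentence} The key aspect here is <mask>."
    intermediate = fill(step1, top_k=1)[0]["token_str"].strip()

    # Step 2: reuse the intermediate step, in natural language form, to
    # predict the final answer.
    step2 = (f"{sentence} The key aspect here is {intermediate}. "
             f"Overall it was <mask>.")
    return fill(step2, top_k=1)[0]["token_str"].strip()

print(two_step_predict("The service was slow and the food arrived cold."))
```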
no code implementations • 11 Oct 2023 • Caoyun Fan, Wenqing Chen, Jidong Tian, Yitian Li, Hao He, Yaohui Jin
In this study, we attribute the bias to the model's misuse of label dependency, i.e., the model tends to utilize the correlation shortcut in label dependency rather than fusing text information and label dependency for prediction.
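A toy sketch of what this shortcut looks like; the labels and co-occurrence counts are invented for illustration:

```python
# Hedged sketch of the label-dependency shortcut in multi-label
# classification: once one label fires, strongly correlated labels are
# predicted from co-occurrence statistics alone, without consulting the text.
import numpy as np

labels = ["sports", "politics", "economy"]
# Label co-occurrence counts estimated from a hypothetical training set;
# the diagonal holds each label's own frequency.
cooc = np.array([[50.0,  1.0,  2.0],
                 [ 1.0, 40.0, 30.0],
                 [ 2.0, 30.0, 45.0]])
cond = cooc / cooc.diagonal()[:, None]  # rough P(column label | row label)

predicted = {"politics"}  # suppose the model has already emitted "politics"
for i, row_label in enumerate(labels):
    if row_label in predicted:
        for j, col_label in enumerate(labels):
            if cond[i, j] > 0.5 and col_label not in predicted:
                predicted.add(col_label)

print(predicted)  # {'politics', 'economy'}: the text itself was never used
```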
no code implementations • 10 Oct 2023 • Caoyun Fan, Wenqing Chen, Jidong Tian, Yitian Li, Hao He, Yaohui Jin
Counterfactually-Augmented Data (CAD), produced by minimally editing sentences so that their labels flip, has the potential to improve the Out-Of-Distribution (OOD) generalization capability of language models, as CAD induces language models to exploit domain-independent causal features and exclude spurious correlations.
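A minimal worked example of such a pair; the sentences are invented, not drawn from any CAD benchmark:

```python
# An original example and its counterfactual: a minimal edit (two words)
# flips the label while everything else stays fixed, so the edited words
# carry the causal signal and the shared context becomes uninformative.
original = {
    "text": "The acting was brilliant and kept me engaged.",
    "label": "positive",
}
counterfactual = {
    "text": "The acting was dreadful and kept me bored.",
    "label": "negative",
}

# Training on both members of the pair discourages reliance on spurious
# features (e.g., the topic word "acting") that the edit leaves unchanged.
augmented_batch = [original, counterfactual]
```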
no code implementations • 18 Feb 2023 • Caoyun Fan, Wenqing Chen, Jidong Tian, Yitian Li, Hao He, Yaohui Jin
Counterfactually-Augmented Data (CAD) has the potential to improve language models' Out-Of-Distribution (OOD) generalization capability, as CAD induces language models to exploit causal features and exclude spurious correlations.
no code implementations • 18 Feb 2023 • Caoyun Fan, Wenqing Chen, Jidong Tian, Yitian Li, Hao He, Yaohui Jin
A series of studies has shown that excessive gradient noise leads to performance degradation in Single-Task Learning (STL). In the Multi-Task Learning (MTL) scenario, however, Inter-Task Gradient Noise (ITGN) is an additional source of gradient noise for each task, which can also affect the optimization process.
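A hedged PyTorch sketch of measuring how one task's gradient interferes with another's on shared parameters; the cosine-similarity test is a common proxy for gradient conflict, not necessarily the paper's definition of ITGN:

```python
import torch
import torch.nn.functional as F

def grad_vector(loss, params):
    """Flatten the gradient of one task's loss over shared parameters."""
    grads = torch.autograd.grad(loss, params, retain_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

def gradient_conflict(loss_a, loss_b, params):
    """Cosine similarity between two tasks' gradients; negative values
    indicate that task B's gradient perturbs task A's update direction."""
    g_a = grad_vector(loss_a, params)
    g_b = grad_vector(loss_b, params)
    return F.cosine_similarity(g_a, g_b, dim=0).item()

# Illustrative usage: a shared encoder with two task-specific heads.
encoder = torch.nn.Linear(16, 8)
head_a, head_b = torch.nn.Linear(8, 2), torch.nn.Linear(8, 2)
x = torch.randn(4, 16)
y_a, y_b = torch.randint(0, 2, (4,)), torch.randint(0, 2, (4,))
loss_a = F.cross_entropy(head_a(encoder(x)), y_a)
loss_b = F.cross_entropy(head_b(encoder(x)), y_b)
print(gradient_conflict(loss_a, loss_b, list(encoder.parameters())))
```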
no code implementations • 18 May 2021 • Wenqing Chen, Jidong Tian, Caoyun Fan, Hao He, Yaohui Jin
The intermediate task would help the model better understand the visual features and thus alleviate the content inconsistency problem.