Probing the Robustness of Pre-trained Language Models for Entity Matching

The paradigm of fine-tuning Pre-trained Language Models (PLMs) has been successful in Entity Matching (EM). Despite their remarkable performance, PLMs exhibit a tendency to learn spurious correlations from training data. In this work, we investigate whether PLM-based entity matching models can be trusted in real-world applications where the data distribution differs from that of training. To this end, we design an evaluation benchmark to assess the robustness of EM models and facilitate their deployment in real-world settings. Our assessments reveal that data imbalance in the training data is a key problem for robustness. We also find that data augmentation alone is not sufficient to make a model robust. As a remedy, we prescribe simple modifications that can improve the robustness of PLM-based EM models. Our experiments show that, while yielding superior results for in-domain generalization, our proposed model significantly improves robustness compared to state-of-the-art EM models.
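
As an illustration of the fine-tuning paradigm the abstract refers to, the sketch below serializes a pair of entity records into a single sequence and scores it with a PLM sequence classifier. This is a minimal example, not the paper's RobEM implementation: the roberta-base backbone and the Ditto-style [COL]/[VAL] serialization are assumptions made for the sketch.

```python
# Minimal sketch of PLM-based entity matching: serialize two records,
# feed the pair to a fine-tunable sequence classifier, read P(match).
# Assumptions: roberta-base backbone, Ditto-style [COL]/[VAL] markers.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "roberta-base"  # assumed backbone for illustration

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

def serialize(record: dict) -> str:
    """Flatten an entity record into a '[COL] attr [VAL] value' string.
    In practice the markers are often registered as special tokens."""
    return " ".join(f"[COL] {k} [VAL] {v}" for k, v in record.items())

def match_probability(left: dict, right: dict) -> float:
    """Return P(match) for a record pair (weights untrained in this sketch)."""
    inputs = tokenizer(serialize(left), serialize(right),
                       truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

# Hypothetical pair with an Abt-Buy-like schema (values are made up):
a = {"name": "Sony Bravia 40in LCD TV", "price": "899.00"}
b = {"name": "Sony BRAVIA 40\" Television", "price": "899"}
print(f"P(match) = {match_probability(a, b):.3f}")
```

Fine-tuning this classifier on labeled match/non-match pairs is the standard setup whose robustness the paper probes under distribution shift.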


Results from the Paper


Task               Dataset        Model  Metric  Value  Global Rank
Entity Resolution  Abt-Buy        RobEM  F1 (%)  90.90  #3
Entity Resolution  Amazon-Google  RobEM  F1 (%)  79.06  #3
