no code implementations • COLING 2022 • Jingyuan Wen, Yutian Luo, Nanyi Fei, Guoxing Yang, Zhiwu Lu, Hao Jiang, Jie Jiang, Zhao Cao
In few-shot text classification, a feasible paradigm for deploying VL-PTMs is to align the input samples and their category names via the text encoders.
no code implementations • 20 Mar 2024 • Qi Liu, Gang Guo, Jiaxin Mao, Zhicheng Dou, Ji-Rong Wen, Hao Jiang, Xinyu Zhang, Zhao Cao
Based on these findings, we then propose several simple document pruning methods to reduce the storage overhead and compare the effectiveness of different pruning methods on different late-interaction models.
1 code implementation • 21 Dec 2023 • Jiayu Lin, Rong Ye, Meng Han, Qi Zhang, Ruofei Lai, Xinyu Zhang, Zhao Cao, Xuanjing Huang, Zhongyu Wei
The results show the competitiveness of our proposed framework and evaluator in counter-argument generation tasks.
1 code implementation • 1 Dec 2023 • Jingcong Liang, Rong Ye, Meng Han, Qi Zhang, Ruofei Lai, Xinyu Zhang, Zhao Cao, Xuanjing Huang, Zhongyu Wei
In this paper, we propose the Hierarchical Argumentation Graph (Hi-ArG), a new structure to organize arguments.
no code implementations • 30 Nov 2023 • Zhebin Zhang, Xinyu Zhang, Yuanhang Ren, Saijiang Shi, Meng Han, Yongkang Wu, Ruofei Lai, Zhao Cao
In this paper, we propose an Induction-Augmented Generation (IAG) framework that utilizes inductive knowledge along with the retrieved documents for implicit reasoning.
1 code implementation • 27 Nov 2023 • Zhen Tian, Changwang Zhang, Wayne Xin Zhao, Xin Zhao, Ji-Rong Wen, Zhao Cao
To address the above issue, we propose the Universal Feature Interaction Network (UFIN) approach for CTR prediction.
no code implementations • 30 Aug 2023 • Hongjin Qian, Zhicheng Dou, Jiejun Tan, Haonan Chen, Haoqi Gu, Ruofei Lai, Xinyu Zhang, Zhao Cao, Ji-Rong Wen
Previous methods use external knowledge as references for text generation to enhance factuality but often struggle with the knowledge mix-up (e.g., entity mismatch) of irrelevant references.
1 code implementation • 21 Jul 2023 • Zhipeng Zhao, Kun Zhou, Xiaolei Wang, Wayne Xin Zhao, Fan Pan, Zhao Cao, Ji-Rong Wen
Conversational recommender systems (CRS) aim to provide recommendation services through natural language conversations.
no code implementations • 19 Jul 2023 • Qingyao Ai, Ting Bai, Zhao Cao, Yi Chang, Jiawei Chen, Zhumin Chen, Zhiyong Cheng, Shoubin Dong, Zhicheng Dou, Fuli Feng, Shen Gao, Jiafeng Guo, Xiangnan He, Yanyan Lan, Chenliang Li, Yiqun Liu, Ziyu Lyu, Weizhi Ma, Jun Ma, Zhaochun Ren, Pengjie Ren, Zhiqiang Wang, Mingwen Wang, Ji-Rong Wen, Le Wu, Xin Xin, Jun Xu, Dawei Yin, Peng Zhang, Fan Zhang, Weinan Zhang, Min Zhang, Xiaofei Zhu
The research field of Information Retrieval (IR) has evolved significantly, expanding beyond traditional search to meet diverse user information needs.
1 code implementation • 2 Jul 2023 • Quan Tu, Shen Gao, Xiaolong Wu, Zhao Cao, Ji-Rong Wen, Rui Yan
Conversational search has been regarded as the next-generation search paradigm.
1 code implementation • 5 Jun 2023 • Xiaolei Wang, Kun Zhou, Xinyu Tang, Wayne Xin Zhao, Fan Pan, Zhao Cao, Ji-Rong Wen
To develop our approach, we characterize user preference and organize the conversation flow by the entities involved in the dialogue, and design a multi-stage recommendation dialogue simulator based on a conversation flow language model.
1 code implementation • 28 May 2023 • Chaojun Xiao, Zhengyan Zhang, Xu Han, Chi-Min Chan, Yankai Lin, Zhiyuan Liu, Xiangyang Li, Zhonghua Li, Zhao Cao, Maosong Sun
By inserting document plugins into the backbone PTM for downstream tasks, we can encode a document one time to handle multiple tasks, which is more efficient than conventional encoding-task coupling methods that simultaneously encode documents and input queries using task-specific encoders.
1 code implementation • 23 May 2023 • Peitian Zhang, Zheng Liu, Yujia Zhou, Zhicheng Dou, Fangchao Liu, Zhao Cao
On top of the term-set DocID, we propose a permutation-invariant decoding algorithm, with which the term set can be generated in any permutation yet will always lead to the corresponding document.
1 code implementation • 4 May 2023 • Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
It is designed to improve the quality of semantic representation where all contextualized embeddings of the pre-trained model can be leveraged.
1 code implementation • 24 Apr 2023 • Haitao Li, Qingyao Ai, Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Zheng Liu, Zhao Cao
Unfortunately, while ANN can improve the efficiency of DR models, it usually comes at a significant cost to retrieval performance.
1 code implementation • 21 Apr 2023 • Zhen Tian, Ting Bai, Wayne Xin Zhao, Ji-Rong Wen, Zhao Cao
EulerNet converts the exponential powers of feature interactions into simple linear combinations of the modulus and phase of the complex features, making it possible to adaptively learn the high-order feature interactions in an efficient way.
1 code implementation • 12 Apr 2023 • Si Sun, Yida Lu, Shi Yu, Xiangyang Li, Zhonghua Li, Zhao Cao, Zhiyuan Liu, Deiming Ye, Jie Bao
Moreover, the dataset is split into disjoint base and novel classes, allowing DR models to be continuously trained on ample data from base classes and a few samples from novel classes.
1 code implementation • CVPR 2023 • Yunpeng Han, Lisai Zhang, Qingcai Chen, Zhijian Chen, Zhonghua Li, Jianxin Yang, Zhao Cao
We propose a method for fine-grained fashion vision-language pre-training based on fashion Symbols and Attributes Prompt (FashionSAP) to model fine-grained multi-modal fashion attributes and characteristics.
1 code implementation • 10 Apr 2023 • Hongjing Qian, Yutao Zhu, Zhicheng Dou, Haoqi Gu, Xinyu Zhang, Zheng Liu, Ruofei Lai, Zhao Cao, Jian-Yun Nie, Ji-Rong Wen
In this paper, we introduce a new NLP task -- generating short factual articles with references for queries by mining supporting evidence from the Web.
no code implementations • 9 Mar 2023 • Lisai Zhang, Qingcai Chen, Zhijian Chen, Yunpeng Han, Zhonghua Li, Zhao Cao
In this paper, we propose a fine-grained VLP scheme without object annotations from the linguistic perspective.
2 code implementations • 19 Dec 2022 • Zhangyue Yin, Yuxin Wang, Xiannian Hu, Yiguang Wu, Hang Yan, Xinyu Zhang, Zhao Cao, Xuanjing Huang, Xipeng Qiu
Multi-Hop Question Answering (MHQA) is a significant area in question answering, requiring multiple reasoning components, including document retrieval, supporting sentence prediction, and answer span extraction.
no code implementations • 14 Sep 2022 • Jiawen Wu, Xinyu Zhang, Yutao Zhu, Zheng Liu, Zikai Guo, Zhaoye Fei, Ruofei Lai, Yongkang Wu, Zhao Cao, Zhicheng Dou
Hyperlinks, which are commonly used in Web pages, have been leveraged for designing pre-training objectives.
1 code implementation • 2 Sep 2022 • Qian Cao, Xu Chen, Ruihua Song, Hao Jiang, Guang Yang, Zhao Cao
To model such human capabilities, in this paper, we define and solve a novel AI creation problem based on human experiences.
1 code implementation • 23 Aug 2022 • Haonan Chen, Zhicheng Dou, Yutao Zhu, Zhao Cao, Xiaohua Cheng, Ji-Rong Wen
To help encode the current user behavior sequence, we propose to use a decoder together with the information from future sequences and a supplemental query.
no code implementations • COLING 2022 • Zhaoye Fei, Yu Tian, Yongkang Wu, Xinyu Zhang, Yutao Zhu, Zheng Liu, Jiawen Wu, Dejiang Kong, Ruofei Lai, Zhao Cao, Zhicheng Dou, Xipeng Qiu
Our experiments on 13 benchmark datasets across five natural language understanding tasks demonstrate the superiority of our method.
1 code implementation • 24 May 2022 • Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
The sentence embedding is generated from the encoder's masked input; then, the original sentence is recovered based on the sentence embedding and the decoder's masked input via masked language modeling.
Ranked #1 on Information Retrieval on MSMARCO
1 code implementation • ACL 2022 • Jiawei Zhou, Xiaoguang Li, Lifeng Shang, Lan Luo, Ke Zhan, Enrui Hu, Xinyu Zhang, Hao Jiang, Zhao Cao, Fan Yu, Xin Jiang, Qun Liu, Lei Chen
To alleviate the data scarcity problem in training question answering systems, recent works propose additional intermediate pre-training for dense passage retrieval (DPR).
no code implementations • 28 Feb 2022 • Daniel Gao, Yantao Jia, Lei LI, Chengzhen Fu, Zhicheng Dou, Hao Jiang, Xinyu Zhang, Lei Chen, Zhao Cao
However, to figure out whether PLMs can be reliable knowledge sources and used as alternative knowledge bases (KBs), we need to further explore some critical features of PLMs.
no code implementations • 20 Oct 2021 • Lisai Zhang, Hongfa Wu, Qingcai Chen, Yimeng Deng, Zhonghua Li, Dejiang Kong, Zhao Cao, Joanna Siebert, Yunpeng Han
Cross-modal retrieval has emerged as one of the most important upgrades for text-only search engines (SE).
no code implementations • 14 Oct 2021 • Hao Jiang, Ke Zhan, Jianwei Qu, Yongkang Wu, Zhaoye Fei, Xinyu Zhang, Lei Chen, Zhicheng Dou, Xipeng Qiu, Zikai Guo, Ruofei Lai, Jiawen Wu, Enrui Hu, Yinxia Zhang, Yantao Jia, Fan Yu, Zhao Cao
To increase the number of activated experts without increasing the computational cost, we propose SAM (Switch and Mixture) routing, an efficient hierarchical routing mechanism that activates multiple experts on the same device (GPU).
1 code implementation • NAACL 2022 • Xiangyang Liu, Tianxiang Sun, Junliang He, Jiawen Wu, Lingling Wu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu
ELUE is dedicated to depicting the Pareto frontier for various language understanding tasks, so that it can tell whether, and by how much, a method achieves a Pareto improvement.
no code implementations • 14 Sep 2021 • Ruizhi Pu, Xinyu Zhang, Ruofei Lai, Zikai Guo, Yinxia Zhang, Hao Jiang, Yongkang Wu, Yantao Jia, Zhicheng Dou, Zhao Cao
Finally, the supervisory signal in the rear compressor is computed based on conditional probability and can thus control sample dynamics and further enhance model performance.
1 code implementation • 20 Aug 2021 • Zhengyi Ma, Zhicheng Dou, Wei Xu, Xinyu Zhang, Hao Jiang, Zhao Cao, Ji-Rong Wen
In this paper, we propose to leverage the large-scale hyperlinks and anchor texts to pre-train the language model for ad-hoc retrieval.
3 code implementations • Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval 2021 • Xinyu Zhang, Ke Zhan, Enrui Hu, Chengzhen Fu, Lan Luo, Hao Jiang, Yantao Jia, Fan Yu, Zhicheng Dou, Zhao Cao, Lei Chen
Currently, the most popular method for open-domain Question Answering (QA) adopts a "Retriever and Reader" pipeline, where the retriever extracts a list of candidate documents from a large set of documents, a ranker then ranks the most relevant documents, and the reader extracts the answer from the candidates.
no code implementations • 28 May 2021 • Tianxiang Sun, Yunhua Zhou, Xiangyang Liu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu
In this paper, we show that a novel objective function for the training of the ensemble internal classifiers can be naturally induced from the perspective of ensemble learning and information theory.