1 code implementation • 24 Apr 2024 • Junfeng Tian, Rui Wang, Cong Li, Yudong Zhou, Jun Liu, Jun Wang
This report details the development and key achievements of our latest language model designed for custom large language models.
1 code implementation • 14 Oct 2023 • Junjie Ye, Jie zhou, Junfeng Tian, Rui Wang, Qi Zhang, Tao Gui, Xuanjing Huang
Recently, Target-oriented Multimodal Sentiment Classification (TMSC) has gained significant attention among scholars.
2 code implementations • 8 Oct 2023 • Jiabo Ye, Anwen Hu, Haiyang Xu, Qinghao Ye, Ming Yan, Guohai Xu, Chenliang Li, Junfeng Tian, Qi Qian, Ji Zhang, Qin Jin, Liang He, Xin Alex Lin, Fei Huang
Text is ubiquitous in our visual world, conveying crucial information, such as in documents, websites, and everyday photographs.
1 code implementation • 4 Jul 2023 • Jiabo Ye, Anwen Hu, Haiyang Xu, Qinghao Ye, Ming Yan, Yuhao Dan, Chenlin Zhao, Guohai Xu, Chenliang Li, Junfeng Tian, Qian Qi, Ji Zhang, Fei Huang
Nevertheless, without in-domain training, these models tend to ignore fine-grained OCR features, such as sophisticated tables or large blocks of text, which are essential for OCR-free document understanding.
1 code implementation • 27 Apr 2023 • Qinghao Ye, Haiyang Xu, Guohai Xu, Jiabo Ye, Ming Yan, Yiyang Zhou, Junyang Wang, Anwen Hu, Pengcheng Shi, Yaya Shi, Chenliang Li, Yuanhong Xu, Hehong Chen, Junfeng Tian, Qi Qian, Ji Zhang, Fei Huang, Jingren Zhou
Our code, pre-trained model, instruction-tuned models, and evaluation set are available at https://github. com/X-PLUG/mPLUG-Owl.
Ranked #3 on Visual Question Answering (VQA) on HallusionBench
Visual Question Answering (VQA) Zero-Shot Video Question Answer
1 code implementation • 16 Apr 2023 • Junfeng Tian, Hehong Chen, Guohai Xu, Ming Yan, Xing Gao, Jianhai Zhang, Chenliang Li, Jiayi Liu, Wenshen Xu, Haiyang Xu, Qi Qian, Wei Wang, Qinghao Ye, Jiejing Zhang, Ji Zhang, Fei Huang, Jingren Zhou
In this paper, we present ChatPLUG, a Chinese open-domain dialogue system for digital human applications that instruction finetunes on a wide range of dialogue tasks in a unified internet-augmented format.
3 code implementations • 24 May 2022 • Chenliang Li, Haiyang Xu, Junfeng Tian, Wei Wang, Ming Yan, Bin Bi, Jiabo Ye, Hehong Chen, Guohai Xu, Zheng Cao, Ji Zhang, Songfang Huang, Fei Huang, Jingren Zhou, Luo Si
Large-scale pretrained foundation models have been an emerging paradigm for building artificial intelligence (AI) systems, which can be quickly adapted to a wide range of downstream tasks.
Ranked #1 on Image Captioning on COCO Captions
3 code implementations • ACL 2022 • Xuwu Wang, Junfeng Tian, Min Gui, Zhixu Li, Rui Wang, Ming Yan, Lihan Chen, Yanghua Xiao
In this paper, we present WikiDiverse, a high-quality human-annotated MEL dataset with diversified contextual topics and entity types from Wikinews, which uses Wikipedia as the corresponding knowledge base.
1 code implementation • CVPR 2022 • Jiabo Ye, Junfeng Tian, Ming Yan, Xiaoshan Yang, Xuwu Wang, Ji Zhang, Liang He, Xin Lin
Moreover, since the backbones are query-agnostic, it is difficult to completely avoid the inconsistency issue by training the visual backbone end-to-end in the visual grounding framework.
no code implementations • 17 Nov 2021 • Ming Yan, Haiyang Xu, Chenliang Li, Junfeng Tian, Bin Bi, Wei Wang, Weihua Chen, Xianzhe Xu, Fan Wang, Zheng Cao, Zhicheng Zhang, Qiyu Zhang, Ji Zhang, Songfang Huang, Fei Huang, Luo Si, Rong Jin
The Visual Question Answering (VQA) task utilizes both visual image and language analysis to answer a textual question with respect to an image.
Ranked #8 on Visual Question Answering (VQA) on VQA v2 test-dev
no code implementations • 21 Aug 2021 • Ming Yan, Haiyang Xu, Chenliang Li, Bin Bi, Junfeng Tian, Min Gui, Wei Wang
Existing approaches to vision-language pre-training (VLP) heavily rely on an object detector based on bounding boxes (regions), where salient objects are first detected from images and then a Transformer-based model is used for cross-modal fusion.
no code implementations • SEMEVAL 2021 • Junfeng Tian, Min Gui, Chenliang Li, Ming Yan, Wenming Xiao
We describe our systems of subtask1 and subtask3 for SemEval-2021 Task 6 on Detection of Persuasion Techniques in Texts and Images.
1 code implementation • COLING 2020 • Jie zhou, Junfeng Tian, Rui Wang, Yuanbin Wu, Wenming Xiao, Liang He
However, due to the variety of users{'} emotional expressions across domains, fine-tuning the pre-trained models on the source domain tends to overfit, leading to inferior results on the target domain.
1 code implementation • ACL 2020 • Kai Wang, Junfeng Tian, Rui Wang, Xiaojun Quan, Jianxing Yu
Unlike those pipeline approaches, our act generation module preserves the semantic structures of multi-domain dialogue acts and our response generation module dynamically attends to different acts as needed.
no code implementations • IJCNLP 2019 • Min Gui, Junfeng Tian, Rui Wang, Zhenglu Yang
Attention plays a key role in the improvement of sequence-to-sequence-based document summarization models.
no code implementations • SEMEVAL 2018 • Junfeng Tian, Man Lan, Yuanbin Wu
This paper presents our submissions to SemEval 2018 Task 12: the Argument Reasoning Comprehension Task.
no code implementations • 5 Jan 2018 • Jingang Wang, Junfeng Tian, Long Qiu, Sheng Li, Jun Lang, Luo Si, Man Lan
It is a challenging and practical research problem to obtain effective compression of lengthy product titles for E-commerce.
no code implementations • SEMEVAL 2017 • Junfeng Tian, Zhiheng Zhou, Man Lan, Yuanbin Wu
To address semantic similarity on multilingual and cross-lingual sentences, we firstly translate other foreign languages into English, and then feed our monolingual English system with various interactive features.
Cross-Lingual Semantic Textual Similarity Machine Translation +1