Search Results for author: Qing-Guo Chen

Found 9 papers, 1 paper with code

Advancing Tool-Augmented Large Language Models: Integrating Insights from Errors in Inference Trees

no code implementations • 11 Jun 2024 • Sijia Chen, Yibo Wang, Yi-Feng Wu, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, Lijun Zhang

In this study, we propose an inference trajectory optimization framework based on the preference data extracted from decision trees to address this limitation.

Wings: Learning Multimodal LLMs without Text-only Forgetting

no code implementations • 5 Jun 2024 • Yi-Kai Zhang, Shiyin Lu, Yang Li, Yanqing Ma, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, De-Chuan Zhan, Han-Jia Ye

Initially, image and text inputs are aligned with visual learners operating alongside the main attention, balancing focus on visual elements.

Question Answering, Visual Question Answering

Parrot: Multilingual Visual Instruction Tuning

no code implementations • 4 Jun 2024 • Hai-Long Sun, Da-Wei Zhou, Yang Li, Shiyin Lu, Chao Yi, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, De-Chuan Zhan, Han-Jia Ye

In this paper, we introduce Parrot, a novel method that utilizes textual guidance to drive visual token alignment at the language level.

Ovis: Structural Embedding Alignment for Multimodal Large Language Model

no code implementations • 31 May 2024 • Shiyin Lu, Yang Li, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, Han-Jia Ye

However, the misalignment between two embedding strategies in MLLMs (the structural textual embeddings based on an embedding look-up table and the continuous embeddings generated directly by the vision encoder) poses challenges for a more seamless fusion of visual and textual information.

Language Modelling, Large Language Model

TAI++: Text as Image for Multi-Label Image Classification by Co-Learning Transferable Prompt

no code implementations • 11 May 2024 • Xiangyu Wu, Qing-Yuan Jiang, Yang Yang, Yi-Feng Wu, Qing-Guo Chen, Jianfeng Lu

Then, a co-learning strategy with a dual-adapter module is designed to transfer visual knowledge from pseudo-visual prompt to text prompt, enhancing their visual representation abilities.

Multi-Label Image Classification, Visual Prompt Tuning

Sparse Attentive Memory Network for Click-through Rate Prediction with Long Sequences

1 code implementation • 8 Aug 2022 • Qianying Lin, Wen-Ji Zhou, Yanshi Wang, Qing Da, Qing-Guo Chen, Bing Wang

SAM supports efficient training and real-time inference for user behavior sequences with lengths on the scale of thousands.

Click-Through Rate Prediction, Sequential Recommendation

Iterative Memory Network for Long Sequential User Behavior Modeling in Recommender Systems

no code implementations • 29 Sep 2021 • Qianying Lin, Wen-Ji Zhou, Yanshi Wang, Qing Da, Qing-Guo Chen, Bing Wang

Extensive empirical studies show that our method outperforms various state-of-the-art sequential modeling methods on both public and industrial datasets for long sequential user behavior modeling.

Recommendation Systems