no code implementations • 11 Jun 2024 • Sijia Chen, Yibo Wang, Yi-Feng Wu, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, Lijun Zhang
In this study, we propose an inference trajectory optimization framework based on the preference data extracted from decision trees to address this limitation.
no code implementations • 5 Jun 2024 • Yi-Kai Zhang, Shiyin Lu, Yang Li, Yanqing Ma, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, De-Chuan Zhan, Han-Jia Ye
Initially, image and text inputs are aligned with visual learners operating alongside the main attention, balancing focus on visual elements.
no code implementations • 4 Jun 2024 • Hai-Long Sun, Da-Wei Zhou, Yang Li, Shiyin Lu, Chao Yi, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, De-Chuan Zhan, Han-Jia Ye
In this paper, we introduce Parrot, a novel method that utilizes textual guidance to drive visual token alignment at the language level.
no code implementations • 31 May 2024 • Shiyin Lu, Yang Li, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, Han-Jia Ye
However, the misalignment between two embedding strategies in MLLMs -- the structural textual embeddings based on an embedding look-up table and the continuous embeddings generated directly by the vision encoder -- makes challenges for a more seamless fusion of visual and textual information.
no code implementations • 11 May 2024 • Xiangyu Wu, Qing-Yuan Jiang, Yang Yang, Yi-Feng Wu, Qing-Guo Chen, Jianfeng Lu
Then, a co-learning strategy with a dual-adapter module is designed to transfer visual knowledge from pseudo-visual prompt to text prompt, enhancing their visual representation abilities.
1 code implementation • 8 Aug 2022 • Qianying Lin, Wen-Ji Zhou, Yanshi Wang, Qing Da, Qing-Guo Chen, Bing Wang
SAM supports efficient training and real-time inference for user behavior sequences with lengths on the scale of thousands.
no code implementations • 29 Sep 2021 • Qianying Lin, Wen-Ji Zhou, Yanshi Wang, Qing Da, Qing-Guo Chen, Bing Wang
Extensive empirical studies show that our method outperforms various state-of-the-art sequential modeling methods on both public and industrial datasets for long sequential user behavior modeling.
no code implementations • 30 Jul 2020 • He Huang, Yuanwei Chen, Wei Tang, Wenhao Zheng, Qing-Guo Chen, Yao Hu, Philip Yu
On the other hand, there is a large semantic gap between seen and unseen classes in the existing multi-label classification datasets.
no code implementations • 8 Jan 2020 • Shu-Ting Shi, Wenhao Zheng, Jun Tang, Qing-Guo Chen, Yao Hu, Jianke Zhu, Ming Li
Click-through rate (CTR) prediction is an essential task in industrial applications such as video recommendation.