no code implementations • 7 Apr 2024 • Yimu Wang, Shuai Yuan, Xiangru Jian, Wei Pang, Mushi Wang, Ning Yu
While recent progress in video-text retrieval has been driven by the exploration of powerful model architectures and training strategies, the representation learning ability of video-text retrieval models is still limited due to low-quality and scarce training data annotations.
no code implementations • 16 Feb 2024 • Yimu Wang, He Zhao, Ruizhi Deng, Frederick Tung, Greg Mori
Pretext training followed by task-specific fine-tuning has been a successful approach in vision and language domains.
1 code implementation • 20 Oct 2023 • Xiangru Jian, Yimu Wang
However, a recent study shows that multi-modal data representations tend to cluster within a limited convex cone (known as the representation degeneration problem), which hinders retrieval performance due to the inseparability of these representations.
1 code implementation • 17 Oct 2023 • Yimu Wang, Xiangru Jian, Bo Xue
In this work, we present a post-processing solution to address the hubness problem in cross-modal retrieval, a phenomenon where a small number of gallery data points are frequently retrieved, resulting in a decline in retrieval performance.
no code implementations • 5 Sep 2023 • TaeHoon Kim, Pyunghwan Ahn, Sangyun Kim, Sihaeng Lee, Mark Marsden, Alessandra Sala, Seung Hwan Kim, Bohyung Han, Kyoung Mu Lee, Honglak Lee, Kyounghoon Bae, Xiangyu Wu, Yi Gao, Hailiang Zhang, Yang Yang, Weili Guo, Jianfeng Lu, Youngtaek Oh, Jae Won Cho, Dong-Jin Kim, In So Kweon, Junmo Kim, Wooyoung Kang, Won Young Jhoo, Byungseok Roh, Jonghwan Mun, Solgil Oh, Kenan Emir Ak, Gwang-Gook Lee, Yan Xu, Mingwei Shen, Kyomin Hwang, Wonsik Shin, Kamin Lee, Wonhark Park, Dongkwan Lee, Nojun Kwak, Yujin Wang, Yimu Wang, Tiancheng Gu, Xingchang Lv, Mingmao Sun
In this report, we introduce the NICE (New frontiers for zero-shot Image Captioning Evaluation) project and share the results and outcomes of the 2023 challenge.
no code implementations • 24 Jul 2023 • Yimu Wang, Peng Shi, Hongyang Zhang
Furthermore, to show the transferability of obstinate word substitutions found by GradObstinate, we replace the words in four representative NLP benchmarks with their obstinate substitutions.
no code implementations • CVPR 2023 • Yimu Wang, Dinghuai Zhang, Yihan Wu, Heng Huang, Hongyang Zhang
We identify a phenomenon named player domination in the bargaining game, namely that the existing max-based approaches, such as MAX and MSD, do not converge.
1 code implementation • 19 Feb 2023 • Yimu Wang, Peng Shi
While recent progress in video-text retrieval has been driven by the exploration of better representation learning, in this paper we present a novel multi-grained sparse learning framework, S3MA, to learn an aligned sparse space shared between the video and the text for video-text retrieval.
Ranked #16 on Video Retrieval on MSR-VTT-1kA
1 code implementation • 17 Feb 2023 • Qiying Yu, Yang Liu, Yimu Wang, Ke Xu, Jingjing Liu
In this work, we propose Contrastive Representation Ensemble and Aggregation for Multimodal FL (CreamFL), a multimodal federated learning framework that enables training larger server models from clients with heterogeneous model architectures and data modalities, while only communicating knowledge on a public dataset.
no code implementations • 16 Feb 2022 • Yimu Wang, Kun Yu, Yan Wang, Hui Xue
In this paper, to extract better features and improve performance, we propose a novel method, namely the multi-view fusion transformer (MVFT), along with a novel attention mechanism.
no code implementations • 28 Apr 2020 • Bo Xue, Guanghui Wang, Yimu Wang, Lijun Zhang
In this paper, we study the problem of stochastic linear bandits with finite action sets.