1 code implementation • 27 Feb 2024 • Shuangrui Ding, Zihan Liu, Xiaoyi Dong, Pan Zhang, Rui Qian, Conghui He, Dahua Lin, Jiaqi Wang
We present SongComposer, an innovative LLM designed for song composition.
1 code implementation • 29 Nov 2023 • Shuangrui Ding, Rui Qian, Haohang Xu, Dahua Lin, Hongkai Xiong
In this paper, we propose a simple yet effective approach for self-supervised video object segmentation (VOS).
1 code implementation • 26 Sep 2023 • Pan Zhang, Xiaoyi Dong, Bin Wang, Yuhang Cao, Chao Xu, Linke Ouyang, Zhiyuan Zhao, Haodong Duan, Songyang Zhang, Shuangrui Ding, Wenwei Zhang, Hang Yan, Xinyue Zhang, Wei Li, Jingwen Li, Kai Chen, Conghui He, Xingcheng Zhang, Yu Qiao, Dahua Lin, Jiaqi Wang
We propose InternLM-XComposer, a vision-language large model that enables advanced image-text comprehension and composition.
Ranked #9 on Visual Question Answering (VQA) on InfiMM-Eval
1 code implementation • ICCV 2023 • Rui Qian, Shuangrui Ding, Xian Liu, Dahua Lin
In the second stage, for each semantics, we randomly sample slots from the corresponding Gaussian distribution and perform masked feature aggregation within the semantic area to exploit temporal correspondence patterns for instance identification.
1 code implementation • ICCV 2023 • Shuangrui Ding, Peisen Zhao, Xiaopeng Zhang, Rui Qian, Hongkai Xiong, Qi Tian
Based on the STA score, we are able to progressively prune the tokens without introducing any additional parameters or requiring further re-training.
no code implementations • 1 Oct 2022 • Shuangrui Ding, Weidi Xie, Yabo Chen, Rui Qian, Xiaopeng Zhang, Hongkai Xiong, Qi Tian
In this paper, we consider the task of unsupervised object discovery in videos.
Ranked #3 on Unsupervised Object Segmentation on DAVIS 2016
1 code implementation • 26 Jul 2022 • Rui Qian, Shuangrui Ding, Xian Liu, Dahua Lin
In this paper, we propose a novel learning scheme for self-supervised video representation learning.
no code implementations • 12 Jul 2022 • Shuangrui Ding, Rui Qian, Hongkai Xiong
In this way, the static scene and the dynamic motion are simultaneously encoded into the compact RGB representation.
1 code implementation • 10 Jun 2022 • Haohang Xu, Shuangrui Ding, Xiaopeng Zhang, Hongkai Xiong, Qi Tian
Specifically, MRA consistently enhances the performance on supervised, semi-supervised as well as few-shot classification.
1 code implementation • CVPR 2022 • Shuangrui Ding, Maomao Li, Tianyu Yang, Rui Qian, Haohang Xu, Qingyi Chen, Jue Wang, Hongkai Xiong
To alleviate such bias, we propose Foreground-background Merging (FAME) to deliberately compose the moving foreground region of the selected video onto the static background of others.
1 code implementation • ICCV 2021 • Rui Qian, Yuxi Li, Huabin Liu, John See, Shuangrui Ding, Xian Liu, Dian Li, Weiyao Lin
The crux of self-supervised video representation learning is to build general features from unlabeled videos.
2 code implementations • NeurIPS 2020 • Jiaqi Ma, Shuangrui Ding, Qiaozhu Mei
Our theoretical and empirical analyses suggest that there is a discrepancy between the loss and mis-classification rate, as the latter presents a diminishing-return pattern when the number of attacked nodes increases.