1 code implementation • 21 Mar 2024 • Weipeng Deng, Runyu Ding, Jihan Yang, Jiahui Liu, Yijiang Li, Xiaojuan Qi, Edith Ngai
To test the language understandability of 3D-VL models, we first propose a language robustness task for systematically assessing 3D-VL models across various tasks, benchmarking their performance when presented with different language style variants.
1 code implementation • 5 Feb 2024 • Jihan Yang, Runyu Ding, Ellis Brown, Xiaojuan Qi, Saining Xie
There is a sensory gulf between the Earth that humans inhabit and the digital realms in which modern AI agents are created.
no code implementations • 1 Aug 2023 • Runyu Ding, Jihan Yang, Chuhui Xue, Wenqing Zhang, Song Bai, Xiaojuan Qi
To address this challenge, we propose to harness pre-trained vision-language (VL) foundation models that encode extensive knowledge from image-text pairs to generate captions for multi-view images of 3D scenes.
Ranked #3 on 3D Open-Vocabulary Instance Segmentation on S3DIS
3D Open-Vocabulary Instance Segmentation • Instance Segmentation
no code implementations • 3 Apr 2023 • Jihan Yang, Runyu Ding, Zhe Wang, Xiaojuan Qi
Existing 3D scene understanding tasks have achieved high performance on closed-set benchmarks but fail to handle novel categories in real-world applications.
1 code implementation • CVPR 2023 • Runyu Ding, Jihan Yang, Chuhui Xue, Wenqing Zhang, Song Bai, Xiaojuan Qi
Open-vocabulary scene understanding aims to localize and recognize unseen categories beyond the annotated label space.
Ranked #2 on 3D Open-Vocabulary Instance Segmentation on S3DIS
3D Open-Vocabulary Instance Segmentation • Contrastive Learning
1 code implementation • 30 May 2022 • Jihan Yang, Shaoshuai Shi, Runyu Ding, Zhe Wang, Xiaojuan Qi
Then, we build a benchmark to assess existing KD methods developed in the 2D domain for 3D object detection upon six well-constructed teacher-student pairs.
1 code implementation • 4 Apr 2022 • Runyu Ding, Jihan Yang, Li Jiang, Xiaojuan Qi
Deep learning approaches achieve prominent success in 3D semantic segmentation.
2 code implementations • CVPR 2021 • Mutian Xu, Runyu Ding, Hengshuang Zhao, Xiaojuan Qi
The key idea of PAConv is to construct the convolution kernel by dynamically assembling basic weight matrices stored in a Weight Bank, where the coefficients of these weight matrices are self-adaptively learned from point positions through ScoreNet.
Ranked #2 on Point Cloud Segmentation on PointCloud-C
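The kernel assembly described in the PAConv entry above (a Weight Bank of basic matrices combined with coefficients from ScoreNet) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: ScoreNet is reduced to a single linear layer followed by a softmax, its input is the neighbor position relative to the center point, and neighbor responses are aggregated with a channel-wise max. All function and parameter names (`score_net`, `assemble_kernel`, `paconv_point`, `score_weights`) are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def score_net(rel_pos, score_weights):
    """Hypothetical one-layer ScoreNet (assumption, not the paper's MLP).

    Maps a 3D relative position to one coefficient per weight matrix
    in the bank. score_weights has shape M x 3.
    """
    logits = [sum(w * p for w, p in zip(row, rel_pos)) for row in score_weights]
    return softmax(logits)


def assemble_kernel(weight_bank, scores):
    """Build a kernel as the score-weighted sum of the bank's matrices.

    weight_bank holds M matrices, each of shape c_in x c_out.
    """
    c_in, c_out = len(weight_bank[0]), len(weight_bank[0][0])
    kernel = [[0.0] * c_out for _ in range(c_in)]
    for score, matrix in zip(scores, weight_bank):
        for i in range(c_in):
            for j in range(c_out):
                kernel[i][j] += score * matrix[i][j]
    return kernel


def paconv_point(center, neighbors, neighbor_feats, weight_bank, score_weights):
    """Output feature for one center point.

    Each neighbor gets its own position-dependent kernel; responses are
    combined with a channel-wise max (an assumed aggregation choice).
    """
    c_out = len(weight_bank[0][0])
    out = [-math.inf] * c_out
    for pos, feat in zip(neighbors, neighbor_feats):
        rel_pos = [p - c for p, c in zip(pos, center)]
        kernel = assemble_kernel(weight_bank, score_net(rel_pos, score_weights))
        response = [
            sum(feat[i] * kernel[i][j] for i in range(len(feat)))
            for j in range(c_out)
        ]
        out = [max(o, r) for o, r in zip(out, response)]
    return out


if __name__ == "__main__":
    # Toy setup: M = 2 weight matrices, 2 input channels, 2 output channels.
    bank = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
    score_w = [[0.5, -0.2, 0.1], [0.3, 0.4, -0.6]]
    neighbors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    feats = [[1.0, 2.0], [3.0, -1.0]]
    print(paconv_point([0.0, 0.0, 0.0], neighbors, feats, bank, score_w))
```

Because the coefficients depend on each neighbor's position, two neighbors with identical features can still be transformed by different kernels, which is the property the abstract snippet emphasizes. The nested loops are written for readability; a practical version would use batched tensor operations.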