no code implementations • 3 Apr 2024 • Xu Wang, YiFan Li, Qiudan Zhang, Wenhui Wu, Mark Junjie Li, Jianmin Jiang
However, previous 3D scene graph generation methods rely on fully supervised learning and require large amounts of entity-level annotations for objects and relations, which are extremely resource-consuming and tedious to obtain.
no code implementations • 15 Dec 2023 • Xiaoxu Xu, Yitian Yuan, Qiudan Zhang, Wenhui Wu, Zequn Jie, Lin Ma, Xu Wang
During the inference stage, the learned text-3D correspondence helps ground text queries to the 3D target objects, even without 2D images.
no code implementations • CVPR 2019 • Qiudan Zhang, Xu Wang, Shiqi Wang, Shikai Li, Sam Kwong, Jianmin Jiang
Finally, a Convolutional Long Short-Term Memory (Conv-LSTM) based fusion network is developed to model the instantaneous interactions between spatio-temporal and depth attributes, such that the ultimate stereoscopic saliency maps over time are produced.