no code implementations • 23 Apr 2024 • Hongyu Chen, Yiqi Gao, Min Zhou, Peng Wang, Xubin Li, Tiezheng Ge, Bo Zheng
Meanwhile, a network, dubbed Masked ControlNet, is designed to utilize these object masks for object generation in the misaligned visual control region.
no code implementations • 4 Oct 2023 • Lingru Zhou, Yiqi Gao, Manqing Zhang, Peng Wu, Peng Wang, Yanning Zhang
To address this challenge, we construct a human-centric video surveillance captioning dataset, which provides detailed descriptions of the dynamic behaviors of 7,820 individuals.
no code implementations • CVPR 2023 • Wei Suo, Mengyang Sun, Weisong Liu, Yiqi Gao, Peng Wang, Yanning Zhang, Qi Wu
The VQA Natural Language Explanation (VQA-NLE) task aims to explain the decision-making process of VQA models in natural language.
1 code implementation • ECCV 2022 • Wei Suo, Mengyang Sun, Kai Niu, Yiqi Gao, Peng Wang, Yanning Zhang, Qi Wu
Text-based person search aims to associate pedestrian images with natural language descriptions.
Ranked #8 on Text based Person Retrieval on ICFG-PEDES
no code implementations • 6 May 2022 • Yiqi Gao, Xinglin Hou, Wei Suo, Mengyang Sun, Tiezheng Ge, Yuning Jiang, Peng Wang
As for the latter, "couple" means treating the generation of visual semantic and syntax-related words equally.
no code implementations • 27 Apr 2022 • Yiqi Gao, Xinglin Hou, Yuanmeng Zhang, Tiezheng Ge, Yuning Jiang, Peng Wang
Existing image captioning systems are dedicated to generating narrative captions for images, which are spatially detached from the image in presentation.