no code implementations • CVPR 2023 • Zongheng Tang, Yifan Sun, Si Liu, Yi Yang
Second, through our design, the object queries and the foreground query in the decoder share consensus on the class semantics, so the strong and weak supervision benefit each other for domain alignment.
1 code implementation • 10 Nov 2020 • Zongheng Tang, Yue Liao, Si Liu, Guanbin Li, Xiaojie Jin, Hongxu Jiang, Qian Yu, Dong Xu
HC-STVG is a video grounding task that requires both spatial (where) and temporal (when) localization.