1 code implementation • 5 Feb 2024 • Yiyuan Zhang, Yuhao Kang, Zhixin Zhang, Xiaohan Ding, Sanyuan Zhao, Xiangyu Yue
We introduce $\textit{InteractiveVideo}$, a user-centric framework for video generation.
no code implementations • 6 Jun 2023 • Yukun Zhai, Xiaoqiang Zhang, Xiameng Qin, Sanyuan Zhao, Xingping Dong, Jianbing Shen
End-to-end text spotting is a vital computer vision task that aims to integrate scene text detection and recognition into a unified framework.
no code implementations • 8 Feb 2023 • Jiawei Liu, Xingping Dong, Sanyuan Zhao, Jianbing Shen
To achieve simultaneous detection for both common and rare objects, we propose a novel task, called generalized few-shot 3D object detection, where we have a large amount of training data for common (base) objects, but only a few data for rare (novel) classes.
1 code implementation • 14 Dec 2021 • JianJian Cao, Xiameng Qin, Sanyuan Zhao, Jianbing Shen
In this paper, we focus on these two problems and propose a Graph Matching Attention (GMA) network.
no code implementations • ICCV 2021 • Xin Hao, Sanyuan Zhao, Mang Ye, Jianbing Shen
Cross-modality person re-identification is a challenging task due to large cross-modality discrepancy and intra-modality variations.
Cross-Modality Person Re-identification Person Re-Identification
no code implementations • CVPR 2020 • Tao Li, Zhiyuan Liang, Sanyuan Zhao, Jiahao Gong, Jianbing Shen
For the global error, we first transform category-wise features into a high-level graph model with coarse-grained structural information, and then decouple the high-level graph to reconstruct the category features.
1 code implementation • ECCV 2018 • Hongmei Song, Wenguan Wang, Sanyuan Zhao, Jianbing Shen, Kin-Man Lam
This paper proposes a fast video salient object detection model, based on a novel recurrent network architecture, named Pyramid Dilated Bidirectional ConvLSTM (PDB-ConvLSTM).
Ranked #1 on Video Salient Object Detection on UVSD (using extra training data)
no code implementations • 28 Jul 2017 • Weilin Cong, Sanyuan Zhao, Hui Tian, Jianbing Shen
Real-world face detection and alignment demand an advanced discriminative model to address challenges by pose, lighting and expression.