1 code implementation • 30 Apr 2024 • Hang Du, Sicheng Zhang, Binzhu Xie, Guoshun Nan, Jiayang Zhang, Junrui Xu, Hangyu Liu, Sicong Leng, Jiangming Liu, Hehe Fan, Dajiu Huang, Jing Feng, Linli Chen, Can Zhang, Xuhuan Li, Hao Zhang, Jianhang Chen, Qimei Cui, Xiaofeng Tao
In pursuit of these answers, we present a comprehensive benchmark for Causation Understanding of Video Anomaly (CUVA).
1 code implementation • 12 Oct 2023 • Zeyu Cai, Can Zhang, Xunhao Chen, Shanghuan Liu, Chengqian Jin, Feipeng Da
To improve the inference speed of the reconstruction network, this paper proposes an MLP Architecture for Adaptive-Mask-based Dual-Camera (MLP-AMDC) to replace the transformer structure of the network.
no code implementations • ICCV 2023 • Can Zhang, Gim Hee Lee
Despite the competitive performance, these pseudo labeling methods rely heavily on the source domain to generate pseudo labels for the target domain and therefore still suffer considerably from source data bias.
no code implementations • 4 Aug 2023 • Jingyi Wang, Can Zhang, Jinfa Huang, Botao Ren, Zhidong Deng
(ii) We explore intra-entity and cross-entity interactions among the superpixels to enrich fine-grained interactions between entities at an earlier stage.
no code implementations • 2 Jul 2023 • Chenyang Qiu, Yingsheng Geng, Junrui Lu, Kaida Chen, Shitong Zhu, Ya Su, Guoshun Nan, Can Zhang, Junsong Fu, Qimei Cui, Xiaofeng Tao
This motivates us to propose 3D-IDS, a novel method that aims to tackle the above issues through two-step feature disentanglements and a dynamic graph diffusion scheme.
1 code implementation • 15 May 2023 • Jingyi Wang, Jinfa Huang, Can Zhang, Zhidong Deng
In this paper, we propose a Time-variant Relation-aware TRansformer (TR$^2$), which aims to model the temporal change of relations in dynamic scene graphs.
1 code implementation • 21 Mar 2023 • Tao Yang, Chuang Liu, Xiaofeng Ma, Weijia Lu, Ning Wu, Bingyang Li, Zhifei Yang, Peng Liu, Lin Sun, Xiaodong Zhang, Can Zhang
In addition, in our proposed neural network framework, the output of the neural network is defined as probability events, and the inference model for the classification task is deduced from a statistical analysis of these events.
no code implementations • CVPR 2023 • Meng Cao, Fangyun Wei, Can Xu, Xiubo Geng, Long Chen, Can Zhang, Yuexian Zou, Tao Shen, Daxin Jiang
Weakly-Supervised Video Grounding (WSVG) aims to localize events of interest in untrimmed videos with only video-level annotations.
1 code implementation • ICCV 2023 • Conghui Hu, Can Zhang, Gim Hee Lee
This limitation motivates us to present the first attempt at domain-generalized unsupervised cross-domain image retrieval (DG-UCDIR) aiming at facilitating image retrieval between any two unseen domains in an unsupervised way.
1 code implementation • 21 Jul 2022 • Meng Cao, Tianyu Yang, Junwu Weng, Can Zhang, Jue Wang, Yuexian Zou
To further enhance the temporal reasoning ability of the learned features, we propose a context projection head and a temporal-aware contrastive loss to perceive the contextual relationships.
no code implementations • 26 May 2022 • Can Zhang, Gim Hee Lee
However, source domain bias that deteriorates the pseudo-labels can still exist, since a network shared by the source and target domains is typically used for pseudo-label selection.
no code implementations • 31 Mar 2022 • Liyu Wu, Can Zhang, Yuexian Zou
Inspired by recent attention mechanisms, we propose a multi-grain contextual focus module, termed MCF, to capture action-associated relation information from the body joints and parts.
1 code implementation • CVPR 2022 • Can Zhang, Tianyu Yang, Junwu Weng, Meng Cao, Jue Wang, Yuexian Zou
These pre-trained models can be sub-optimal for temporal localization tasks due to the inherent discrepancy between video-level classification and clip-level localization.
no code implementations • 30 Nov 2021 • Wei Guo, Can Zhang, ZhiCheng He, Jiarui Qin, Huifeng Guo, Bo Chen, Ruiming Tang, Xiuqiang He, Rui Zhang
With the help of two novel CNN-based multi-interest extractors, self-supervision signals are discovered with full consideration of different interest representations (point-wise and union-wise), interest dependencies (short-range and long-range), and interest correlations (inter-item and intra-item).
no code implementations • EMNLP 2021 • Meng Cao, Long Chen, Mike Zheng Shou, Can Zhang, Yuexian Zou
Almost all existing video grounding methods fall into two frameworks: 1) Top-down model: It predefines a set of segment candidates and then conducts segment classification and regression.
no code implementations • 12 Aug 2021 • Meng Cao, Can Zhang, Long Chen, Mike Zheng Shou, Yuexian Zou
In this paper, our analysis shows that the motion cues behind the optical flow features carry complementary information.
Tasks: Optical Flow Estimation, Weakly-supervised Temporal Action Localization
no code implementations • 30 Jun 2021 • Liyu Wu, Yuexian Zou, Can Zhang
Efficient long-short temporal modeling is key to enhancing the performance of the action recognition task.
no code implementations • 29 Jun 2021 • Ranyu Ning, Can Zhang, Yuexian Zou
Current mainstream one-stage TAD approaches localize and classify action proposals relying on pre-defined anchors, where the location and scale for action instances are set by designers.
no code implementations • 24 Jun 2021 • Meng Cao, Can Zhang, Dongming Yang, Yuexian Zou
Compared to the traditional single-stage segmentation network, our NASK conducts detection in a coarse-to-fine manner, with the first-stage segmentation spotting rectangular text proposals and the second stage retrieving compact representations.
no code implementations • 30 Apr 2021 • Dongming Yang, Yuexian Zou, Can Zhang, Meng Cao, Jie Chen
Upon the frame, an Interaction Intensifier Module and a Correlation Parsing Module are carefully designed, where: a) interactive semantics from humans can be exploited and passed to objects to intensify interactions; and b) interactive correlations among humans, objects, and interactions are integrated to promote predictions.
1 code implementation • CVPR 2021 • Can Zhang, Meng Cao, Dongming Yang, Jie Chen, Yuexian Zou
In this paper, we argue that learning by comparing helps identify these hard snippets and we propose to utilize snippet Contrastive learning to Localize Actions, CoLA for short.
no code implementations • 12 Dec 2020 • Can Zhang, Hong Liu, Wei Guo, Mang Ye
RGB-Infrared person re-identification (RGB-IR Re-ID) aims to match persons across heterogeneous images captured by visible and thermal cameras, which is of great significance for surveillance systems under poor lighting conditions.
2 code implementations • 8 Aug 2020 • Can Zhang, Yuexian Zou, Guang Chen, Lei Gan
In contrast to optical flow, our PA focuses more on distilling the motion information at boundaries.
Ranked #2 on Action Recognition on Jester (Gesture Recognition)
1 code implementation • 27 Nov 2019 • Bang Yang, Yuexian Zou, Fenglin Liu, Can Zhang
However, mainstream video captioning methods suffer from slow inference speed due to the sequential nature of autoregressive decoding, and tend to generate generic descriptions due to the insufficient training of visual words (e.g., nouns and verbs) and an inadequate decoding paradigm.
1 code implementation • 9 May 2019 • Mohammadreza Zolfaghari, Özgün Çiçek, Syed Mohsin Ali, Farzaneh Mahdisoltani, Can Zhang, Thomas Brox
Foreseeing the future is one of the key factors of intelligence.