Search Results for author: Can Zhang

Found 25 papers, 11 papers with code

Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Video Anomaly

1 code implementation • 30 Apr 2024 • Hang Du, Sicheng Zhang, Binzhu Xie, Guoshun Nan, Jiayang Zhang, Junrui Xu, Hangyu Liu, Sicong Leng, Jiangming Liu, Hehe Fan, Dajiu Huang, Jing Feng, Linli Chen, Can Zhang, Xuhuan Li, Hao Zhang, Jianhang Chen, Qimei Cui, Xiaofeng Tao

In pursuit of these answers, we present a comprehensive benchmark for Causation Understanding of Video Anomaly (CUVA).

Anomaly Detection

MLP-AMDC: An MLP Architecture for Adaptive-Mask-based Dual-Camera snapshot hyperspectral imaging

1 code implementation • 12 Oct 2023 • Zeyu Cai, Can Zhang, Xunhao Chen, Shanghuan Liu, Chengqian Jin, Feipeng Da

In order to improve the inference speed of the reconstruction network, this paper proposes An MLP Architecture for Adaptive-Mask-based Dual-Camera (MLP-AMDC) to replace the transformer structure of the network.

GeT: Generative Target Structure Debiasing for Domain Adaptation

no code implementations • ICCV 2023 • Can Zhang, Gim Hee Lee

Despite the competitive performance, these pseudo labeling methods rely heavily on the source domain to generate pseudo labels for the target domain and therefore still suffer considerably from source data bias.

Domain Adaptation, Pseudo Label

Improving Scene Graph Generation with Superpixel-Based Interaction Learning

no code implementations • 4 Aug 2023 • Jingyi Wang, Can Zhang, Jinfa Huang, Botao Ren, Zhidong Deng

(ii) We explore intra-entity and cross-entity interactions among the superpixels to enrich fine-grained interactions between entities at an earlier stage.

Graph Generation, Scene Graph Generation, +1

3D-IDS: Doubly Disentangled Dynamic Intrusion Detection

no code implementations • 2 Jul 2023 • Chenyang Qiu, Yingsheng Geng, Junrui Lu, Kaida Chen, Shitong Zhu, Ya Su, Guoshun Nan, Can Zhang, Junsong Fu, Qimei Cui, Xiaofeng Tao

This motivates us to propose 3D-IDS, a novel method that aims to tackle the above issues through two-step feature disentanglements and a dynamic graph diffusion scheme.

Intrusion Detection

Cross-Modality Time-Variant Relation Learning for Generating Dynamic Scene Graphs

1 code implementation • 15 May 2023 • Jingyi Wang, Jinfa Huang, Can Zhang, Zhidong Deng

In this paper, we propose a Time-variant Relation-aware TRansformer (TR$^2$), which aims to model the temporal change of relations in dynamic scene graphs.

Relation, Scene Graph Generation, +1

Indeterminate Probability Neural Network

1 code implementation • 21 Mar 2023 • Tao Yang, Chuang Liu, Xiaofeng Ma, Weijia Lu, Ning Wu, Bingyang Li, Zhifei Yang, Peng Liu, Lin Sun, Xiaodong Zhang, Can Zhang

Besides, for our proposed neural network framework, the output of neural network is defined as probability events, and based on the statistical analysis of these events, the inference model for classification task is deduced.

Classification

Iterative Proposal Refinement for Weakly-Supervised Video Grounding

no code implementations • CVPR 2023 • Meng Cao, Fangyun Wei, Can Xu, Xiubo Geng, Long Chen, Can Zhang, Yuexian Zou, Tao Shen, Daxin Jiang

Weakly-Supervised Video Grounding (WSVG) aims to localize events of interest in untrimmed videos with only video-level annotations.

Sentence, Video Grounding

Unsupervised Feature Representation Learning for Domain-generalized Cross-domain Image Retrieval

1 code implementation • ICCV 2023 • Conghui Hu, Can Zhang, Gim Hee Lee

This limitation motivates us to present the first attempt at domain-generalized unsupervised cross-domain image retrieval (DG-UCDIR) aiming at facilitating image retrieval between any two unseen domains in an unsupervised way.

Contrastive Learning, Image Retrieval, +2

LocVTP: Video-Text Pre-training for Temporal Localization

1 code implementation • 21 Jul 2022 • Meng Cao, Tianyu Yang, Junwu Weng, Can Zhang, Jue Wang, Yuexian Zou

To further enhance the temporal reasoning ability of the learned feature, we propose a context projection head and a temporal aware contrastive loss to perceive the contextual relationships.

Retrieval, Temporal Localization, +1

CA-UDA: Class-Aware Unsupervised Domain Adaptation with Optimal Assignment and Pseudo-Label Refinement

no code implementations • 26 May 2022 • Can Zhang, Gim Hee Lee

However, source domain bias that deteriorates the pseudo-labels can still exist since the shared network of the source and target domains is typically used for the pseudo-label selections.

Image Classification, Missing Labels, +2

SpatioTemporal Focus for Skeleton-based Action Recognition

no code implementations • 31 Mar 2022 • Liyu Wu, Can Zhang, Yuexian Zou

Inspired by the recent attention mechanism, we propose a multi-grain contextual focus module, termed MCF, to capture the action associated relation information from the body joints and parts.

Action Recognition, Skeleton Based Action Recognition

Unsupervised Pre-training for Temporal Action Localization Tasks

1 code implementation • CVPR 2022 • Can Zhang, Tianyu Yang, Junwu Weng, Meng Cao, Jue Wang, Yuexian Zou

These pre-trained models can be sub-optimal for temporal localization tasks due to the inherent discrepancy between video-level classification and clip-level localization.

Contrastive Learning, Representation Learning, +4

MISS: Multi-Interest Self-Supervised Learning Framework for Click-Through Rate Prediction

no code implementations • 30 Nov 2021 • Wei Guo, Can Zhang, ZhiCheng He, Jiarui Qin, Huifeng Guo, Bo Chen, Ruiming Tang, Xiuqiang He, Rui Zhang

With the help of two novel CNN-based multi-interest extractors, self-supervision signals are discovered with full considerations of different interest representations (point-wise and union-wise), interest dependencies (short-range and long-range), and interest correlations (inter-item and intra-item).

Click-Through Rate Prediction, Contrastive Learning, +3

On Pursuit of Designing Multi-modal Transformer for Video Grounding

no code implementations • EMNLP 2021 • Meng Cao, Long Chen, Mike Zheng Shou, Can Zhang, Yuexian Zou

Almost all existing video grounding methods fall into two frameworks: 1) Top-down model: It predefines a set of segment candidates and then conducts segment classification and regression.

Decoder, Sentence, +1

Deep Motion Prior for Weakly-Supervised Temporal Action Localization

no code implementations • 12 Aug 2021 • Meng Cao, Can Zhang, Long Chen, Mike Zheng Shou, Yuexian Zou

In this paper, we show that the motion cues behind the optical flow features carry complementary information.

Optical Flow Estimation, Weakly-supervised Temporal Action Localization, +1

Long-Short Temporal Modeling for Efficient Action Recognition

no code implementations • 30 Jun 2021 • Liyu Wu, Yuexian Zou, Can Zhang

Efficient long-short temporal modeling is key to enhancing the performance of the action recognition task.

Action Recognition

SRF-Net: Selective Receptive Field Network for Anchor-Free Temporal Action Detection

no code implementations • 29 Jun 2021 • Ranyu Ning, Can Zhang, Yuexian Zou

Current mainstream one-stage TAD approaches localize and classify action proposals relying on pre-defined anchors, where the location and scale for action instances are set by designers.

Action Detection

All You Need is a Second Look: Towards Arbitrary-Shaped Text Detection

no code implementations • 24 Jun 2021 • Meng Cao, Can Zhang, Dongming Yang, Yuexian Zou

Compared to the traditional single-stage segmentation network, our NASK conducts the detection in a coarse-to-fine manner with the first stage segmentation spotting the rectangle text proposals and the second one retrieving compact representations.

Instance Segmentation, Segmentation, +2

RR-Net: Injecting Interactive Semantics in Human-Object Interaction Detection

no code implementations • 30 Apr 2021 • Dongming Yang, Yuexian Zou, Can Zhang, Meng Cao, Jie Chen

Upon the frame, an Interaction Intensifier Module and a Correlation Parsing Module are carefully designed, where: a) interactive semantics from humans can be exploited and passed to objects to intensify interactions, b) interactive correlations among humans, objects and interactions are integrated to promote predictions.

Human-Object Interaction Detection, Relation

CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning

1 code implementation • CVPR 2021 • Can Zhang, Meng Cao, Dongming Yang, Jie Chen, Yuexian Zou

In this paper, we argue that learning by comparing helps identify these hard snippets and we propose to utilize snippet Contrastive learning to Localize Actions, CoLA for short.

Ranked #4 on Weakly Supervised Action Localization on ActivityNet-1.2

CoLA, Contrastive Learning, +3

Multi-Scale Cascading Network with Compact Feature Learning for RGB-Infrared Person Re-Identification

no code implementations • 12 Dec 2020 • Can Zhang, Hong Liu, Wei Guo, Mang Ye

RGB-Infrared person re-identification (RGB-IR Re-ID) aims to match persons from heterogeneous images captured by visible and thermal cameras, which is of great significance in the surveillance system under poor light conditions.

Person Re-Identification

PAN: Towards Fast Action Recognition via Learning Persistence of Appearance

2 code implementations • 8 Aug 2020 • Can Zhang, Yuexian Zou, Guang Chen, Lei Gan

In contrast to optical flow, our PA focuses more on distilling the motion information at boundaries.

Ranked #2 on Action Recognition on Jester (Gesture Recognition)

Action Recognition, Optical Flow Estimation, +1

Non-Autoregressive Coarse-to-Fine Video Captioning

1 code implementation • 27 Nov 2019 • Bang Yang, Yuexian Zou, Fenglin Liu, Can Zhang

However, mainstream video captioning methods suffer from slow inference speed due to the sequential manner of autoregressive decoding, and prefer generating generic descriptions due to the insufficient training of visual words (e.g., nouns and verbs) and inadequate decoding paradigm.

Sentence, Video Captioning

Learning Representations for Predicting Future Activities

1 code implementation • 9 May 2019 • Mohammadreza Zolfaghari, Özgün Çiçek, Syed Mohsin Ali, Farzaneh Mahdisoltani, Can Zhang, Thomas Brox

Foreseeing the future is one of the key factors of intelligence.

Future prediction
