1 code implementation • 2 Apr 2024 • Ye Liu, Jixuan He, Wanhua Li, Junsik Kim, Donglai Wei, Hanspeter Pfister, Chang Wen Chen
Video temporal grounding (VTG) is a fine-grained video understanding problem that aims to ground relevant clips in untrimmed videos given natural language queries.
Ranked #2 on Highlight Detection on QVHighlights
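VTG predictions are commonly scored by temporal IoU between a predicted clip and the ground-truth clip. A minimal sketch of that metric (the function name and `(start, end)`-in-seconds interval format are assumptions for illustration, not from the paper):

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two clips given as (start, end) in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = max(pred[1], gt[1]) - min(pred[0], gt[0])
    return inter / union if union > 0 else 0.0

print(temporal_iou((10.0, 20.0), (15.0, 25.0)))  # 0.333...
```

Grounding a query then amounts to returning the candidate clip with the highest score, with tIoU against the annotation used for evaluation.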
no code implementations • 28 Mar 2024 • Binyuan Huang, Yuqing Wen, Yucheng Zhao, Yaosi Hu, Yingfei Liu, Fan Jia, Weixin Mao, Tiancai Wang, Chi Zhang, Chang Wen Chen, Zhenzhong Chen, Xiangyu Zhang
Autonomous driving progress relies on large-scale annotated datasets.
no code implementations • 25 Mar 2024 • Rui Zhu, Yingwei Pan, Yehao Li, Ting Yao, Zhenglong Sun, Tao Mei, Chang Wen Chen
Despite this progress, the mask strategy still suffers from two inherent limitations: (a) a training-inference discrepancy and (b) fuzzy relations between mask reconstruction and the generative diffusion process, resulting in sub-optimal training of DiT.
no code implementations • 26 Oct 2023 • Runnan Liu, Liang Liu, Yin Xu, Dazhi He, Wenjun Zhang, Chang Wen Chen
We first categorize two types of channel covariance matrix changes based on their impact on system design: Type I change, which denotes the change in the BS receive covariance matrix, and Type II change, which denotes the change in the IRS transmit/receive covariance matrix.
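To make the covariance-change notion concrete, a small sketch of estimating a receive covariance matrix from channel samples and flagging a change; the sample-based estimator and the Frobenius-norm test are illustrative assumptions, not the paper's detection rule:

```python
import numpy as np

rng = np.random.default_rng(0)

def receive_covariance(channel_samples):
    """Estimate R = E[h h^H] from complex channel vector samples (one per row)."""
    H = np.asarray(channel_samples)
    return (H.conj().T @ H) / H.shape[0]

# Two batches of i.i.d. channel samples; a change in the underlying
# covariance shows up as a large Frobenius-norm difference.
h_old = rng.normal(size=(500, 4)) + 1j * rng.normal(size=(500, 4))
h_new = 2.0 * (rng.normal(size=(500, 4)) + 1j * rng.normal(size=(500, 4)))
delta = np.linalg.norm(receive_covariance(h_new) - receive_covariance(h_old))
print(delta > 1.0)  # True: a large shift would trigger re-design
```

In the paper's taxonomy, such a shift in the BS receive covariance would be a Type I change, while a shift in the IRS transmit/receive covariance would be Type II.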
no code implementations • 5 Oct 2023 • Baolin Chong, Hancheng Lu, Langtian Qin, Chenwu Zhang, Jiasen Li, Chang Wen Chen
However, the extensive transmission of video data in IoVT poses challenges in terms of delay and power consumption.
no code implementations • 25 Aug 2023 • Jiaming Shen, Kun Hu, Wei Bao, Chang Wen Chen, Zhiyong Wang
The 2D animation workflow is typically initiated with the creation of keyframes using sketch-based drawing.
no code implementations • 26 Jul 2023 • Fengbo Lan, Chang Wen Chen
The increasing prevalence of mobile devices has led to significant advancements in mobile camera systems and improved image quality.
1 code implementation • ICCV 2023 • Bruce X. B. Yu, Zhi Zhang, Yongxu Liu, Sheng-hua Zhong, Yan Liu, Chang Wen Chen
3D human pose lifting is one of the promising research directions toward this task, where both estimated pose and ground-truth pose data are used for training.
Ranked #1 on 3D Human Pose Estimation on HumanEva-I
no code implementations • 10 May 2023 • Bruce X. B. Yu, Jianlong Chang, Haixin Wang, Lingbo Liu, Shijie Wang, Zhiyu Wang, Junfan Lin, Lingxi Xie, Haojie Li, Zhouchen Lin, Qi Tian, Chang Wen Chen
With the surprising development of pre-trained visual foundation models, visual tuning has moved beyond the standard modus operandi of fine-tuning the whole pre-trained model or just the fully connected layer.
no code implementations • 22 Mar 2023 • Yan Luo, Ye Liu, Fu-Lai Chung, Yu Liu, Chang Wen Chen
The history encoder is designed to model mobility patterns from historical check-in sequences, while the query generator explicitly learns user preferences to generate user-specific intention queries.
1 code implementation • CVPR 2023 • Tiancheng Lin, Zhimiao Yu, Hongyu Hu, Yi Xu, Chang Wen Chen
This deficiency is a confounder that limits the performance of existing MIL methods.
no code implementations • 20 Feb 2023 • Yuang Chen, Hancheng Lu, Langtian Qin, Chenwu Zhang, Chang Wen Chen
In this paper, fundamentals and performance tradeoffs of the neXt-generation ultra-reliable and low-latency communication (xURLLC) are investigated from the perspective of stochastic network calculus (SNC).
1 code implementation • CVPR 2023 • Junfan Lin, Jianlong Chang, Lingbo Liu, Guanbin Li, Liang Lin, Qi Tian, Chang Wen Chen
During inference, instead of changing the motion generator, our method reformulates the input text into a masked motion as the prompt for the motion generator to "reconstruct" the motion.
1 code implementation • 3 Oct 2022 • Bruce X. B. Yu, Jianlong Chang, Lingbo Liu, Qi Tian, Chang Wen Chen
Towards this goal, we propose a framework with a unified view of PETL called visual-PETL (V-PETL) to investigate the effects of different PETL techniques, data scales of downstream domains, positions of trainable parameters, and other aspects affecting the trade-off.
no code implementations • 11 Jul 2022 • Huairui Wang, Zhenzhong Chen, Chang Wen Chen
In this paper, we propose a learned video compression framework with a heterogeneous deformable compensation strategy (HDCVC) to tackle the problem of unstable compression performance caused by single-size deformable kernels in the downsampled feature domain.
1 code implementation • CVPR 2022 • Ye Liu, Siyuan Li, Yang Wu, Chang Wen Chen, Ying Shan, XiaoHu Qie
Finding relevant moments and highlights in videos according to natural language queries is a common and highly valuable need in the current era of video content explosion.
Ranked #3 on Highlight Detection on YouTube Highlights
no code implementations • 12 Mar 2022 • Qinyu Li, Tengpeng Li, Hanli Wang, Chang Wen Chen
In this work, a comprehensive study is conducted on video paragraph captioning, with the goal to generate paragraph-level descriptions for a given video.
no code implementations • 10 Mar 2022 • Tengpeng Li, Hanli Wang, Bin He, Chang Wen Chen
Third, a unified one-stage story generation model with an encoder-decoder structure is proposed to simultaneously train and infer the knowledge-enriched attention network, group-wise semantic module, and multi-modal story generation decoder in an end-to-end fashion.
1 code implementation • 22 Nov 2021 • Ye Liu, Huifang Li, Chao Hu, Shuang Luo, Yan Luo, Chang Wen Chen
The proposed model exploits three lightweight plug-and-play modules, namely dense feature pyramid network (DenseFPN), spatial context pyramid (SCP), and hierarchical region of interest extractor (HRoIE), to aggregate global visual context in the feature, spatial, and instance domains, respectively.
no code implementations • 29 Sep 2021 • Tao Wei, Yonghong Tian, YaoWei Wang, Yun Liang, Chang Wen Chen
In this research, we propose a novel and principled operator called optimized separable convolution, which, by optimally designing the internal number of groups and kernel sizes for general separable convolutions, can achieve a complexity of $O(C^{\frac{3}{2}}K)$.
1 code implementation • ICCV 2021 • Rui Zhu, Bingchen Zhao, Jingen Liu, Zhenglong Sun, Chang Wen Chen
To our knowledge, this is the first attempt of its kind.
no code implementations • 20 Jan 2021 • Tao Wei, Angelica I Aviles-Rivero, Shuo Wang, Yuan Huang, Fiona J Gilbert, Carola-Bibiane Schönlieb, Chang Wen Chen
The current state-of-the-art approaches for medical image classification rely on the de facto method for ConvNets: fine-tuning.
no code implementations • 1 Jan 2021 • Tao Wei, Yonghong Tian, Chang Wen Chen
In this research, we propose a novel operator called \emph{optimal separable convolution} which can be calculated at $O(C^{\frac{3}{2}}KHW)$ by optimal design for the internal number of groups and kernel sizes for general separable convolutions.
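The paper's optimal grouping analysis is not reproduced here, but a small FLOP count shows why separable (grouped plus pointwise) convolutions cut the $C^2K^2$ cost of a standard convolution; the shapes and the depthwise-separable decomposition are illustrative assumptions:

```python
def conv_flops(c_in, c_out, k, h, w, groups=1):
    """Multiply count of a k x k convolution with the given group count."""
    return (c_in // groups) * c_out * k * k * h * w

C, K, H, W = 256, 3, 56, 56
standard = conv_flops(C, C, K, H, W)
# Depthwise-separable: a per-channel k x k conv followed by a 1 x 1 mix.
separable = conv_flops(C, C, K, H, W, groups=C) + conv_flops(C, C, 1, H, W)
print(standard / separable)  # roughly an 8-9x reduction at these shapes
```

The optimal separable convolution of the paper generalizes this idea by choosing the number of groups and kernel sizes to minimize the total cost.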
2 code implementations • 14 Aug 2020 • Ye Liu, Junsong Yuan, Chang Wen Chen
We consider the problem of Human-Object Interaction (HOI) Detection, which aims to locate and recognize HOI instances in the form of <human, action, object> in images.
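The <human, action, object> output format can be represented as a simple triplet record; this data structure is an illustrative sketch, not the paper's implementation:

```python
from dataclasses import dataclass

@dataclass
class HOIInstance:
    """One detected HOI triplet <human, action, object> with boxes."""
    human_box: tuple      # (x1, y1, x2, y2) in pixels
    object_box: tuple
    action: str
    object_label: str
    score: float          # detection confidence

det = HOIInstance((10, 20, 80, 200), (60, 120, 140, 180), "ride", "bicycle", 0.91)
print(det.action, det.object_label)  # ride bicycle
```

An HOI detector outputs a ranked list of such instances per image, each pairing one human box with one object box under a predicted action.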
no code implementations • 13 Jul 2020 • Lifang Wu, Zhou Yang, Qi Wang, Meng Jian, Boxuan Zhao, Junchi Yan, Chang Wen Chen
Based on the observations, we propose a scheme to fuse global and local motion patterns (MPs) and key visual information (KVI) for semantic event recognition in basketball videos.
no code implementations • 6 Jun 2019 • Yu Liu, Li Deng, Jianshu Chen, Chang Wen Chen
Removing the need for parallel training corpora has practical significance for real-world applications, and it is one of the main goals of unsupervised learning.
1 code implementation • ICCV 2019 • Guo-Jun Qi, Liheng Zhang, Chang Wen Chen, Qi Tian
This ensures the resultant TERs of individual images contain the intrinsic information about their visual structures that would equivary extricably under various transformations in a generalized nonlinear case.
no code implementations • 16 Mar 2019 • Lifang Wu, Zhou Yang, Jiaoyu He, Meng Jian, Yaowen Xu, Dezhong Xu, Chang Wen Chen
Therefore, a semantic event in broadcast basketball videos is closely related to both the global motion (camera motion) and the collective motion.
no code implementations • CVPR 2018 • Shuang Ma, Jianlong Fu, Chang Wen Chen, Tao Mei
Specifically, we jointly learn a deep attention encoder, and the instance-level correspondences can consequently be discovered by attending to the learned instances.
no code implementations • 19 Jan 2018 • Jing Zhang, Yang Cao, Yang Wang, Chenglin Wen, Chang Wen Chen
Specifically, we propose to randomly shuffle the pixels in the original images and leverage the shuffled image as input to make the CNN more concerned with statistical properties.
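The pixel-shuffle idea can be sketched in a few lines: permuting pixel positions destroys spatial structure while exactly preserving per-channel statistics. The function name and HWC layout are assumptions for illustration:

```python
import numpy as np

def shuffle_pixels(img, seed=0):
    """Randomly permute pixel positions in an (H, W, C) image, keeping
    pixel values so only channel-wise statistics survive."""
    rng = np.random.default_rng(seed)
    h, w, c = img.shape
    flat = img.reshape(-1, c)
    return flat[rng.permutation(h * w)].reshape(h, w, c)

img = np.arange(2 * 3 * 3).reshape(2, 3, 3)
shuffled = shuffle_pixels(img)
# Per-channel value multisets are preserved exactly.
print(np.allclose(np.sort(shuffled.reshape(-1, 3), axis=0),
                  np.sort(img.reshape(-1, 3), axis=0)))  # True
```

Feeding such shuffled images forces the network to rely on statistical rather than structural cues.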
no code implementations • ICLR 2018 • Tao Wei, Changhu Wang, Chang Wen Chen
In this research, we present a novel learning scheme called network iterative learning for deep neural networks.
no code implementations • CVPR 2017 • Jing Zhang, Yang Cao, Shuai Fang, Yu Kang, Chang Wen Chen
Then, we propose a simple but effective image prior, maximum reflectance prior, to estimate the varying ambient illumination.
no code implementations • CVPR 2017 • Shuang Ma, Jing Liu, Chang Wen Chen
However, the performance of these deep CNN methods is often compromised by the constraint that the neural network takes only fixed-size inputs.
Ranked #2 on Aesthetics Quality Assessment on AVA
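One standard way around the fixed-size-input constraint is adaptive pooling, which maps any spatial resolution to a fixed output grid; this NumPy sketch is a generic illustration of that idea, not the paper's architecture:

```python
import numpy as np

def adaptive_avg_pool2d(x, out_h, out_w):
    """Average-pool an (H, W, C) feature map to a fixed (out_h, out_w, C),
    letting a network accept variable-size inputs."""
    h, w, _ = x.shape
    ys = np.linspace(0, h, out_h + 1).astype(int)
    xs = np.linspace(0, w, out_w + 1).astype(int)
    return np.array([[x[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].mean(axis=(0, 1))
                      for j in range(out_w)] for i in range(out_h)])

for size in [(17, 23), (31, 12)]:
    feat = np.random.rand(*size, 8)
    print(adaptive_avg_pool2d(feat, 4, 4).shape)  # (4, 4, 8) every time
```

Because the output grid is fixed, the fully connected head sees a constant-size vector regardless of the original image resolution.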
no code implementations • 12 Jan 2017 • Tao Wei, Changhu Wang, Chang Wen Chen
Different from existing work, where basic morphing types at the layer level were addressed, we target the central problem of network morphism at a higher level, i.e., how a convolutional layer can be morphed into an arbitrary module of a neural network.
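The core constraint of network morphism, that the morphed network computes the same function before further training, can be shown with the simplest possible case; this linear, identity-initialized sketch only illustrates the idea, while the paper handles the harder convolutional, modular setting:

```python
import numpy as np

# Function-preserving expansion: replace one linear layer W with two
# layers (W, I) so the morphed network computes the same function and
# can then be trained further to exploit the added capacity.
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 8))          # original layer: R^8 -> R^16
W1, W2 = W.copy(), np.eye(16)         # morphed into two stacked layers

x = rng.normal(size=8)
print(np.allclose(W @ x, W2 @ (W1 @ x)))  # True: same function
```

Morphing into richer modules amounts to finding analogous decompositions that preserve the layer's function while introducing new trainable structure.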
no code implementations • 2 Jun 2016 • Yu Liu, Jianlong Fu, Tao Mei, Chang Wen Chen
Second, by using sGRU as basic units, the BMRNN is trained to align the local storylines into the global sequential timeline.
no code implementations • 5 Mar 2016 • Tao Wei, Changhu Wang, Yong Rui, Chang Wen Chen
The second requirement for this network morphism is its ability to deal with non-linearity in a network.