no code implementations • 4 Jun 2024 • Jifei Luo, Hantao Yao, Changsheng Xu
However, existing techniques that construct the affinity graph based on pairwise instances can lead to the propagation of misinformation from outliers and other manifolds, resulting in inaccurate results.
1 code implementation • 24 May 2024 • Hantao Yao, Rui Zhang, Lu Yu, Changsheng Xu
Comprehensive evaluations across various benchmarks and tasks confirm SEP's efficacy in prompt tuning.
1 code implementation • 16 May 2024 • Yifan Xu, Xiaoshan Yang, Yaguang Song, Changsheng Xu
Specifically, we incorporate a routed visual expert with a cross-modal bridge module into a pretrained LLM to route the vision and language flows during attention computing to enable different attention patterns in inner-modal modeling and cross-modal interaction scenarios.
2 code implementations • 20 Apr 2024 • Linhui Xiao, Xiaoshan Yang, Fang Peng, YaoWei Wang, Changsheng Xu
Specifically, HiVG consists of a multi-layer adaptive cross-modal bridge and a hierarchical multimodal low-rank adaptation (HiLoRA) paradigm.
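The HiLoRA paradigm builds on standard low-rank adaptation (LoRA). As a rough illustration of the underlying LoRA update only (not the paper's hierarchical, multimodal variant), a frozen weight matrix is augmented with a trainable low-rank residual; all names and values below are illustrative:

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16.0):
    """Low-rank adapted linear layer: y = x @ (W + scale * A @ B).

    W: frozen (d_in, d_out) pretrained weight; A: (d_in, r) and
    B: (r, d_out) are the only trainable parameters, with rank
    r << min(d_in, d_out).
    """
    r = A.shape[1]
    scale = alpha / r
    return x @ W + scale * (x @ A) @ B

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 32, 4
W = rng.normal(size=(d_in, d_out))      # frozen pretrained weight
A = rng.normal(size=(d_in, r)) * 0.01   # trainable down-projection
B = np.zeros((r, d_out))                # trainable up-projection (init 0)
x = rng.normal(size=(2, d_in))

# With B initialized to zero, the adapted layer equals the frozen one.
assert np.allclose(lora_forward(x, W, A, B), x @ W)
```

A hierarchical scheme would apply such adapters at multiple layers or granularities; the paper defines how.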
2 code implementations • 9 Apr 2024 • Ming Tao, Bing-Kun Bao, Hao Tang, YaoWei Wang, Changsheng Xu
3) The story visualization and continuation models are trained and inferred independently, which is not user-friendly.
1 code implementation • 25 Jan 2024 • Nisha Huang, WeiMing Dong, Yuxin Zhang, Fan Tang, Ronghui Li, Chongyang Ma, Xiu Li, Changsheng Xu
Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images.
no code implementations • 21 Jan 2024 • Yukun Zuo, Hantao Yao, Lu Yu, Liansheng Zhuang, Changsheng Xu
Nonetheless, these learnable prompts tend to concentrate on the discriminative knowledge of the current task while ignoring past-task knowledge, so that learnable prompts still suffer from catastrophic forgetting.
1 code implementation • 11 Jan 2024 • Yukun Zuo, Hantao Yao, Liansheng Zhuang, Changsheng Xu
We introduce Hierarchical Augmentation and Distillation (HAD), which comprises the Hierarchical Augmentation Module (HAM) and Hierarchical Distillation Module (HDM) to efficiently utilize the hierarchical structure of data and models, respectively.
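The distillation half of HAD rests on standard knowledge distillation. A generic sketch of a distillation loss (temperature-softened KL divergence between teacher and student class distributions, not HAD's specific hierarchical formulation; all names and the temperature are illustrative):

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(student_logits, teacher_logits, T=2.0):
    """Generic knowledge-distillation loss: KL(teacher || student) over
    temperature-softened class distributions, scaled by T^2 as usual."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return (T ** 2) * np.sum(p * (np.log(p) - np.log(q)), axis=-1).mean()

rng = np.random.default_rng(0)
t = rng.normal(size=(4, 10))
assert distill_loss(t, t) < 1e-9            # identical logits -> zero loss
assert distill_loss(rng.normal(size=(4, 10)), t) > 0.0
```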
1 code implementation • 25 Dec 2023 • Chengcheng Ma, Ismail Elezi, Jiankang Deng, WeiMing Dong, Changsheng Xu
For instance, on CIFAR-10-LT, CPE improves test accuracy by over 2.22% compared to baselines.
1 code implementation • 13 Dec 2023 • Shengsheng Qian, Yifei Wang, Dizhan Xue, Shengjie Zhang, Huaiwen Zhang, Changsheng Xu
After obtaining the threat model trained on the poisoned dataset, our method can precisely detect poisonous samples based on the assumption that masking the backdoor trigger can effectively change the activation of a downstream clustering model.
1 code implementation • 8 Dec 2023 • Yuxin Zhang, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, WeiMing Dong, Changsheng Xu
The essence of a video lies in its dynamic motions, including character actions, object movements, and camera movements.
1 code implementation • 30 Nov 2023 • Hantao Yao, Rui Zhang, Changsheng Xu
However, those textual tokens have a limited generalization ability regarding unseen domains, as they cannot dynamically adjust to the distribution of testing classes.
1 code implementation • 22 Nov 2023 • Junyu Gao, Xuan Yao, Changsheng Xu
Such agents are typically required to execute user instructions in an online manner, leading us to explore the use of unlabeled test samples for effective online model adaptation.
no code implementations • 12 Oct 2023 • Junyu Gao, Xinhong Ma, Changsheng Xu
Despite the great progress of unsupervised domain adaptation (UDA) with deep neural networks, current UDA models are opaque and cannot provide convincing explanations, limiting their application in scenarios that require safe and controllable model decisions.
1 code implementation • 5 Sep 2023 • Dizhan Xue, Shengsheng Qian, Zuyi Zhou, Changsheng Xu
In recent years, cross-modal reasoning (CMR), the process of understanding and reasoning across different modalities, has emerged as a pivotal area with applications spanning from multimedia analysis to healthcare diagnostics.
no code implementations • 30 Aug 2023 • Yifan Xu, Mengdan Zhang, Xiaoshan Yang, Changsheng Xu
In this paper, we explore, for the first time, helpful multi-modal contextual knowledge for understanding novel categories in open-vocabulary object detection (OVD).
no code implementations • 13 Jul 2023 • Jiaming Zhang, Jitao Sang, Qi Yi, Changsheng Xu
Harnessing the concept of non-robust features, we elaborate on two guiding principles for surrogate model selection to explain why the foundational model is an optimal choice for this role.
no code implementations • 5 Jul 2023 • Jie Fu, Junyu Gao, Changsheng Xu
In this paper, to balance the feature learning processes of different modalities, a dynamic gradient modulation (DGM) mechanism is explored, where a novel and effective metric function is designed to measure the imbalanced feature learning between audio and visual modalities.
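The general idea behind gradient modulation is to attenuate updates for whichever modality currently dominates so the weaker one can catch up. A toy sketch in that spirit (the paper's actual metric function and coefficients differ; `alpha` and the `tanh` shape here are illustrative assumptions):

```python
import numpy as np

def modulation_coeffs(score_audio, score_visual, alpha=1.0):
    """Toy gradient-modulation coefficients: the modality with the higher
    confidence score gets its gradients down-scaled; the weaker modality's
    gradients pass through unchanged."""
    rho = score_audio / score_visual               # imbalance ratio
    k_audio = 1.0 - np.tanh(alpha * max(rho - 1.0, 0.0))
    k_visual = 1.0 - np.tanh(alpha * max(1.0 / rho - 1.0, 0.0))
    return k_audio, k_visual

# Audio dominates -> its gradient coefficient shrinks, visual's stays at 1.
k_a, k_v = modulation_coeffs(0.9, 0.3)
assert k_a < 1.0 and k_v == 1.0
```

In training, each modality's gradients would be multiplied by its coefficient before the optimizer step.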
1 code implementation • NeurIPS 2023 • Yifan Xu, Mengdan Zhang, Chaoyou Fu, Peixian Chen, Xiaoshan Yang, Ke Li, Changsheng Xu
To address the learning inertia problem brought by the frozen detector, a vision conditioned masked language prediction strategy is proposed.
Ranked #2 on Few-Shot Object Detection on ODinW-35
1 code implementation • 25 May 2023 • Hantao Yao, Lu Yu, Jifei Luo, Changsheng Xu
In this paper, we propose a novel Identity Knowledge Evolution (IKE) framework for CIOR, consisting of the Identity Knowledge Association (IKA), Identity Knowledge Distillation (IKD), and Identity Knowledge Update (IKU).
3 code implementations • 25 May 2023 • Yuxin Zhang, WeiMing Dong, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Oliver Deussen, Changsheng Xu
We apply ProSpect in various personalized attribute-aware image generation applications, such as image-guided or text-driven manipulations of materials, style, and layout, achieving previously unattainable results from a single image input without fine-tuning the diffusion models.
2 code implementations • 15 May 2023 • Linhui Xiao, Xiaoshan Yang, Fang Peng, Ming Yan, YaoWei Wang, Changsheng Xu
In order to utilize vision and language pre-trained models to address the grounding problem, and reasonably take advantage of pseudo-labels, we propose CLIP-VG, a novel method that can conduct self-paced curriculum adapting of CLIP with pseudo-language labels.
no code implementations • WWW 2023 • Shengsheng Qian, Hong Chen, Dizhan Xue, Quan Fang, Changsheng Xu
To tackle these challenges, we propose an Open-World Social Event Classifier (OWSEC) model in this paper.
1 code implementation • CVPR 2023 • Hantao Yao, Rui Zhang, Changsheng Xu
Representative CoOp-based work combines the learnable textual tokens with the class tokens to obtain specific textual knowledge.
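Mechanically, CoOp-style prompt tuning prepends a set of shared learnable context vectors to each class's token embedding before the text encoder. A minimal sketch of that construction step (shapes and names are illustrative, and the downstream text encoder is omitted):

```python
import numpy as np

def build_prompts(ctx, class_embeds):
    """CoOp-style prompt construction: prepend M shared learnable context
    tokens to each class's token embedding.

    ctx:          (M, d)  learnable context tokens (shared across classes)
    class_embeds: (C, d)  fixed embedding of each class-name token
    returns:      (C, M + 1, d) one prompt sequence per class
    """
    C = class_embeds.shape[0]
    tiled = np.broadcast_to(ctx, (C,) + ctx.shape)                # (C, M, d)
    return np.concatenate([tiled, class_embeds[:, None, :]], axis=1)

rng = np.random.default_rng(0)
ctx = rng.normal(size=(4, 8))     # M = 4 context tokens, d = 8
cls = rng.normal(size=(3, 8))     # C = 3 classes
prompts = build_prompts(ctx, cls)
assert prompts.shape == (3, 5, 8)
```

Only `ctx` would be optimized; the class embeddings and the text encoder stay frozen.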
1 code implementation • 9 Mar 2023 • Yuxin Zhang, Fan Tang, WeiMing Dong, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Changsheng Xu
Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer.
no code implementations • 1 Mar 2023 • Shangxi Wu, Qiuyang He, Fangzhao Wu, Jitao Sang, YaoWei Wang, Changsheng Xu
In this work, we found that the backdoor attack can construct an artificial bias similar to the model bias derived in standard training.
1 code implementation • 23 Feb 2023 • Nisha Huang, Fan Tang, WeiMing Dong, Tong-Yee Lee, Changsheng Xu
Different from current mask-based image editing methods, we propose a novel region-aware diffusion model (RDM) for entity-level image editing, which could automatically locate the region of interest and replace it following given text prompts.
2 code implementations • CVPR 2023 • Ming Tao, Bing-Kun Bao, Hao Tang, Changsheng Xu
The complex scene understanding ability of CLIP enables the discriminator to accurately assess the image quality.
Ranked #3 on Text-to-Image Generation on CUB
1 code implementation • CVPR 2023 • Junyu Gao, Mengyuan Chen, Changsheng Xu
We argue that, for an event residing in one modality, the modality itself should provide ample presence evidence of this event, while the other complementary modality is encouraged to afford the absence evidence as a reference signal.
1 code implementation • ICCV 2023 • Dizhan Xue, Shengsheng Qian, Changsheng Xu
To address these issues, we propose a Variational Causal Inference Network (VCIN) that establishes the causal correlation between predicted answers and explanations, and captures cross-modal relationships to generate rational explanations.
Ranked #1 on Explanatory Visual Question Answering on GQA-REX
no code implementations • CVPR 2023 • Sisi You, Hantao Yao, Bing-Kun Bao, Changsheng Xu
Multiple Object Tracking, which consists of object detection, feature embedding, and identity association, has recently achieved great success.
no code implementations • CVPR 2023 • Mengyuan Chen, Junyu Gao, Changsheng Xu
Aiming to recognize and localize action instances with only video-level labels during training, Weakly-supervised Temporal Action Localization (WTAL) has achieved significant progress in recent years.
1 code implementation • CVPR 2023 • Xi Zhang, Feifei Zhang, Changsheng Xu
Research on continual learning has recently led to a variety of work in the unimodal community; however, little attention has been paid to multimodal tasks such as visual question answering (VQA).
no code implementations • CVPR 2023 • Yuyang Wanyan, Xiaoshan Yang, Chaofan Chen, Changsheng Xu
In meta-training, we design an Active Sample Selection (ASS) module to organize query samples with large differences in the reliability of modalities into different groups based on modality-specific posterior distributions.
1 code implementation • CVPR 2023 • Jiaming Zhang, Xingjun Ma, Qi Yi, Jitao Sang, Yu-Gang Jiang, YaoWei Wang, Changsheng Xu
Furthermore, we propose to leverage Vision-and-Language Pre-trained Models (VLPMs) like CLIP as the surrogate model to improve the transferability of the crafted UCs to diverse domains.
no code implementations • 28 Nov 2022 • Fang Peng, Xiaoshan Yang, Linhui Xiao, YaoWei Wang, Changsheng Xu
Although significant progress has been made in few-shot learning, most existing few-shot image classification methods require supervised pre-training on a large number of samples of base classes, which limits their generalization ability in real-world applications.
1 code implementation • CVPR 2023 • Yuxin Zhang, Nisha Huang, Fan Tang, Haibin Huang, Chongyang Ma, WeiMing Dong, Changsheng Xu
Our key idea is to learn artistic style directly from a single painting and then guide the synthesis without providing complex textual descriptions.
1 code implementation • 19 Nov 2022 • Nisha Huang, Yuxin Zhang, Fan Tang, Chongyang Ma, Haibin Huang, Yong Zhang, WeiMing Dong, Changsheng Xu
Despite the impressive results of arbitrary image-guided style transfer methods, text-driven image stylization has recently been proposed for transferring a natural image into a stylized one according to textual descriptions of the target style provided by the user.
1 code implementation • 4 Nov 2022 • Chengcheng Ma, Yang Liu, Jiankang Deng, Lingxi Xie, WeiMing Dong, Changsheng Xu
Pretrained vision-language models (VLMs) such as CLIP have shown impressive generalization capability in downstream vision tasks with appropriate text prompts.
1 code implementation • ACM MM 2022 • Dizhan Xue, Shengsheng Qian, Quan Fang, Changsheng Xu
Finally, a multimodal transformer decoder constructs attention among multimodal features to learn the story dependency and generates informative, reasonable, and coherent story endings.
Ranked #1 on Image-guided Story Ending Generation on LSMDC-E
1 code implementation • 27 Sep 2022 • Nisha Huang, Fan Tang, WeiMing Dong, Changsheng Xu
Extensive experimental results on the quality and quantity of the generated digital art paintings confirm the effectiveness of the combination of the diffusion model and multimodal guidance.
1 code implementation • IEEE Transactions on Pattern Analysis and Machine Intelligence 2022 • Dizhan Xue, Shengsheng Qian, Quan Fang, Changsheng Xu
To date, most existing techniques convert multimodal data into a common representation space where semantic similarities between samples can be easily measured across multiple modalities.
no code implementations • 22 May 2022 • Yufan Hu, Junyu Gao, Changsheng Xu
Most existing state-of-the-art video classification methods assume that the training data obey a uniform distribution.
1 code implementation • 19 May 2022 • Yuxin Zhang, Fan Tang, WeiMing Dong, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Changsheng Xu
Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.
2 code implementations • 5 Apr 2022 • Jun Hu, Bryan Hooi, Shengsheng Qian, Quan Fang, Changsheng Xu
Based on a Markov process that trades off two types of distances, we present Markov Graph Diffusion Collaborative Filtering (MGDCF) to generalize some state-of-the-art GNN-based CF models.
Ranked #4 on Recommendation Systems on Gowalla
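The "Markov process trading off two distances" idea can be sketched as an iterative diffusion that balances staying close to the initial embeddings against smoothing over graph neighbors, in the spirit of personalized-PageRank propagation (this is a generic sketch, not MGDCF's exact formulation; `alpha`, `K`, and the toy graph are illustrative):

```python
import numpy as np

def markov_diffusion(A_norm, X0, alpha=0.1, K=10):
    """Iterative diffusion: X <- alpha * X0 + (1 - alpha) * A_norm @ X.

    alpha weights fidelity to the initial embeddings X0 against
    smoothing over neighbors via the row-normalized adjacency A_norm.
    """
    X = X0.copy()
    for _ in range(K):
        X = alpha * X0 + (1.0 - alpha) * A_norm @ X
    return X

# Tiny 3-node chain graph: row-normalized adjacency with self-loops.
A = np.array([[0.5, 0.5, 0.0],
              [1/3, 1/3, 1/3],
              [0.0, 0.5, 0.5]])
X0 = np.eye(3)   # one-hot initial node embeddings
X = markov_diffusion(A, X0)
assert X.shape == (3, 3) and np.isfinite(X).all()
```

Larger `alpha` keeps embeddings closer to `X0`; smaller `alpha` smooths them more strongly over the graph.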
1 code implementation • 4 Apr 2022 • Ziyue Wu, Junyu Gao, Shucheng Huang, Changsheng Xu
Then, a commonsense-aware interaction module is designed to obtain bridged visual and text features by utilizing the learned commonsense concepts.
1 code implementation • CVPR 2022 • Junyu Gao, Mengyuan Chen, Changsheng Xu
We target at the task of weakly-supervised action localization (WSAL), where only video-level action labels are available during model training.
1 code implementation • 26 Jan 2022 • Chengcheng Ma, Xingjia Pan, Qixiang Ye, Fan Tang, WeiMing Dong, Changsheng Xu
Semi-supervised object detection has recently achieved substantial progress.
no code implementations • CVPR 2022 • Yiming Li, Xiaoshan Yang, Changsheng Xu
Humans can not only see the collection of objects in visual scenes, but also identify the relationship between objects.
3 code implementations • CVPR 2022 • Yingying Deng, Fan Tang, WeiMing Dong, Chongyang Ma, Xingjia Pan, Lei Wang, Changsheng Xu
The goal of image style transfer is to render an image with artistic features guided by a style reference while maintaining the original content.
1 code implementation • 9 Dec 2021 • Hantao Yao, Changsheng Xu
Unlike the individual-based updating mechanism, the centroid-based updating mechanism that applies the mean feature of each cluster to update the cluster memory can reduce the impact of individual samples.
Ranked #50 on Person Re-Identification on Market-1501
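A centroid-based memory update of this kind is commonly implemented as a momentum moving average toward each cluster's batch-mean feature. A minimal sketch under that assumption (the momentum value, L2 normalization, and all names are illustrative, not the paper's exact recipe):

```python
import numpy as np

def update_cluster_memory(memory, features, labels, momentum=0.9):
    """Centroid-based memory update: each cluster entry moves toward the
    mean feature of its samples in the current batch, which damps the
    influence of any single (possibly noisy) instance."""
    memory = memory.copy()
    for c in np.unique(labels):
        centroid = features[labels == c].mean(axis=0)
        memory[c] = momentum * memory[c] + (1.0 - momentum) * centroid
        memory[c] /= np.linalg.norm(memory[c])  # keep entries L2-normalized
    return memory

rng = np.random.default_rng(0)
mem = rng.normal(size=(5, 16))
mem /= np.linalg.norm(mem, axis=1, keepdims=True)
feats = rng.normal(size=(8, 16))
labels = np.array([0, 0, 1, 1, 1, 3, 3, 4])
new_mem = update_cluster_memory(mem, feats, labels)

# Cluster 2 had no samples in this batch, so its entry is unchanged.
assert np.allclose(new_mem[2], mem[2])
```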
no code implementations • 5 Dec 2021 • Pei Lv, Wentong Wang, Yunxin Wang, Yuzhen Zhang, Mingliang Xu, Changsheng Xu
In detail, when modeling social interaction, we propose a new social soft attention function, which fully considers various interaction factors among pedestrians.
1 code implementation • 2 Dec 2021 • Jun Hu, Shengsheng Qian, Quan Fang, Changsheng Xu
Recently the field has advanced from local propagation schemes that focus on local neighbors towards extended propagation schemes that can directly deal with extended neighbors consisting of both local and high-order neighbors.
no code implementations • 1 Dec 2021 • Wei Wang, Junyu Gao, Changsheng Xu
With this in mind, we design a unified causal framework to learn the deconfounded object-relevant association for more accurate and robust video object grounding.
1 code implementation • 19 Nov 2021 • Desheng Cai, Jun Hu, Quan Zhao, Shengsheng Qian, Quan Fang, Changsheng Xu
In this paper, we present GRecX, an open-source TensorFlow framework for benchmarking GNN-based recommendation models in an efficient and unified way.
no code implementations • 18 Oct 2021 • Xiaowen Huang, Jitao Sang, Jian Yu, Changsheng Xu
Cold-start recommendation is a pressing problem in contemporary online applications.
no code implementations • 29 Sep 2021 • Guanhua Zheng, Jitao Sang, Wang Haonan, Changsheng Xu
Recently, backpropagation (BP)-based feature attribution methods have been widely adopted to interpret the internal mechanisms of convolutional neural networks (CNNs), and are expected to be human-understandable (lucidity) and faithful to decision-making processes (fidelity).
1 code implementation • IEEE Transactions on Multimedia 2021 • Shengsheng Qian, Dizhan Xue, Quan Fang, Changsheng Xu
Firstly, we construct an instance representation learning branch to transform instances of different modalities into a common representation space.
1 code implementation • 3 Aug 2021 • Yifan Xu, Zhijie Zhang, Mengdan Zhang, Kekai Sheng, Ke Li, WeiMing Dong, Liqing Zhang, Changsheng Xu, Xing Sun
Vision transformers (ViTs) have recently received explosive popularity, but the huge computational cost is still a severe issue.
Ranked #11 on Efficient ViTs on ImageNet-1K (with DeiT-T)
1 code implementation • 10 Jul 2021 • Jianyu Wang, Bing-Kun Bao, Changsheng Xu
However, existing graph-based methods fail to perform multi-step reasoning well, neglecting two properties of VideoQA: (1) even for the same video, different questions may require different numbers of video clips or objects to infer the answer with relational reasoning; (2) during reasoning, appearance and motion features have a complicated interdependence, being both correlated with and complementary to each other.
Ranked #29 on Visual Question Answering (VQA) on MSRVTT-QA
no code implementations • CVPR 2021 • Chaofan Chen, Xiaoshan Yang, Changsheng Xu, Xuhui Huang, Zhe Ma
Specifically, we first employ the comparison module to explore the pairwise sample relations to learn rich sample representations in the instance-level graph.
no code implementations • 14 Jun 2021 • Pei Lv, Jianqi Fan, Xixi Nie, WeiMing Dong, Xiaoheng Jiang, Bing Zhou, Mingliang Xu, Changsheng Xu
This framework leverages user interactions to retouch and rank images for aesthetic assessment based on deep reinforcement learning (DRL), and generates a personalized aesthetic distribution that is more in line with the aesthetic preferences of different users.
4 code implementations • 30 May 2021 • Yingying Deng, Fan Tang, WeiMing Dong, Chongyang Ma, Xingjia Pan, Lei Wang, Changsheng Xu
The goal of image style transfer is to render an image with artistic features guided by a style reference while maintaining the original content.
1 code implementation • AAAI 2021 • Shengsheng Qian, Dizhan Xue, Huaiwen Zhang, Quan Fang, Changsheng Xu
To date, most existing methods transform multimodal data into a common representation space where semantic similarities between items can be directly measured across different modalities.
no code implementations • 21 Apr 2021 • Yifan Xu, Kekai Sheng, WeiMing Dong, Baoyuan Wu, Changsheng Xu, Bao-Gang Hu
However, due to unpredictable corruptions (e.g., noise and blur) in real data like web images, domain adaptation methods are increasingly required to be corruption robust on target domains.
no code implementations • 23 Mar 2021 • Xuan Ma, Xiaoshan Yang, Junyu Gao, Changsheng Xu
However, these data streams are multi-source and heterogeneous, containing complex temporal structures with local contextual and global temporal aspects, which makes feature learning and joint data utilization challenging.
1 code implementation • CVPR 2021 • Xingjia Pan, Yingguo Gao, Zhiwen Lin, Fan Tang, WeiMing Dong, Haolei Yuan, Feiyue Huang, Changsheng Xu
Weakly supervised object localization (WSOL) remains an open problem given the deficiency of finding object extent information using a classification network.
1 code implementation • 27 Jan 2021 • Jun Hu, Shengsheng Qian, Quan Fang, Youze Wang, Quan Zhao, Huaiwen Zhang, Changsheng Xu
We introduce tf_geometric, an efficient and friendly library for graph deep learning, which is compatible with both TensorFlow 1.x and 2.x.
no code implementations • ICCV 2021 • Junyu Gao, Changsheng Xu
To tackle this issue, we replace the cross-modal interaction module with a cross-modal common space, in which moment-query alignment is learned and efficient moment search can be performed.
no code implementations • ICCV 2021 • Xinhong Ma, Junyu Gao, Changsheng Xu
This paper proposes a new paradigm for unsupervised domain adaptation, termed as Active Universal Domain Adaptation (AUDA), which removes all label set assumptions and aims for not only recognizing target samples from source classes but also inferring those from target-private classes by using active learning to annotate a small budget of target data.
no code implementations • 4 Dec 2020 • Zhiyong Huang, Kekai Sheng, WeiMing Dong, Xing Mei, Chongyang Ma, Feiyue Huang, Dengwen Zhou, Changsheng Xu
For intra-domain propagation, we propose an effective self-training strategy to mitigate the noises in pseudo-labeled target domain data and improve the feature discriminability in the target domain.
no code implementations • 17 Sep 2020 • Yingying Deng, Fan Tang, Wei-Ming Dong, Haibin Huang, Chongyang Ma, Changsheng Xu
Towards this end, we propose Multi-Channel Correction network (MCCNet), which can be trained to fuse the exemplar style features and input content features for efficient style transfer while naturally maintaining the coherence of input videos.
3 code implementations • CVPR 2022 • Ming Tao, Hao Tang, Fei Wu, Xiao-Yuan Jing, Bing-Kun Bao, Changsheng Xu
To these ends, we propose a simpler but more effective Deep Fusion Generative Adversarial Networks (DF-GAN).
Ranked #4 on Text-to-Image Generation on CUB (Inception score metric)
no code implementations • 18 Jun 2020 • Guanhua Zheng, Jitao Sang, Changsheng Xu
Since the basic assumption of conventional manifold learning fails in cases of sparse and uneven data distributions, we introduce a new target, Minimum Manifold Coding (MMC), for manifold learning to encourage simple and unfolded manifolds.
no code implementations • 2 Jun 2020 • Minxuan Lin, Fan Tang, Wei-Ming Dong, Xiao Li, Chongyang Ma, Changsheng Xu
Currently, there are few methods that can perform both multimodal and multi-domain stylization simultaneously.
no code implementations • 31 May 2020 • Hantao Yao, Shaobo Min, Yongdong Zhang, Changsheng Xu
Then, an attentional graph attribute embedding is proposed to reduce the semantic bias between seen and unseen categories, which utilizes the graph operation to capture the semantic relationship between categories.
no code implementations • 30 May 2020 • Hantao Yao, Changsheng Xu
Based on this repulsion constraint, the repulsion term is proposed to reduce the similarity of distractor images that are not most similar to the probe person.
2 code implementations • 27 May 2020 • Yingying Deng, Fan Tang, Wei-Ming Dong, Wen Sun, Feiyue Huang, Changsheng Xu
Arbitrary style transfer is a significant topic with research value and application prospect.
no code implementations • 25 May 2020 • Shangxi Wu, Jitao Sang, Kaiyuan Xu, Guanhua Zheng, Changsheng Xu
Specifically, AALP consists of an adaptive feature optimization module with Guided Dropout to systematically pursue fewer high-contribution features, and an adaptive sample weighting module by setting sample-specific training weights to balance between logits pairing loss and classification loss.
1 code implementation • CVPR 2020 • Xingjia Pan, Yuqiang Ren, Kekai Sheng, Wei-Ming Dong, Haolei Yuan, Xiaowei Guo, Chongyang Ma, Changsheng Xu
However, the detection of oriented and densely packed objects remains challenging for the following inherent reasons: (1) receptive fields of neurons are all axis-aligned and of the same shape, whereas objects are usually of diverse shapes and align along various directions; (2) detection models are typically trained with generic knowledge and may not generalize well to handle specific objects at test time; (3) limited datasets hinder development on this task.
no code implementations • 26 Feb 2020 • Minxuan Lin, Yingying Deng, Fan Tang, Wei-Ming Dong, Changsheng Xu
Controllable painting generation plays a pivotal role in image stylization.
no code implementations • 28 Nov 2019 • Yi Huang, Xiaoshan Yang, Changsheng Xu
(1) It can model longitudinal heterogeneous EHRs data via capturing the 3-order correlations of different modalities and the irregular temporal impact of historical events.
no code implementations • 28 Nov 2019 • Guanhua Zheng, Jitao Sang, Houqiang Li, Jian Yu, Changsheng Xu
The derived generalization bound based on the ITID assumption identifies the significance of hypothesis invariance in guaranteeing generalization performance.
1 code implementation • Proceedings of the AAAI Conference on Artificial Intelligence 2019 • Junyu Gao, Tianzhu Zhang, Changsheng Xu
To effectively leverage the knowledge graph, we design a novel Two-Stream Graph Convolutional Network (TS-GCN) consisting of a classifier branch and an instance branch.
Ranked #5 on Zero-Shot Action Recognition on Olympics
no code implementations • 24 Jun 2019 • Zhaoquan Yuan, Siyuan Sun, Lixin Duan, Xiao Wu, Changsheng Xu
In AMN, as inspired by generative adversarial networks, we propose to learn multimodal feature representations by finding a more coherent subspace for video clips and the corresponding texts (e.g., subtitles and questions).
no code implementations • 25 May 2019 • Ting-Ting Xie, Xiaoshan Yang, Tianzhu Zhang, Changsheng Xu, Ioannis Patras
Temporal action localization has recently attracted significant interest in the Computer Vision community.
no code implementations • CVPR 2018 • Feifei Zhang, Tianzhu Zhang, Qirong Mao, Changsheng Xu
First, the encoder-decoder structure of the generator can learn a generative and discriminative identity representation for face images.
no code implementations • 3 Mar 2018 • Mingliang Xu, Zhaoyang Ge, Xiaoheng Jiang, Gaoge Cui, Pei Lv, Bing Zhou, Changsheng Xu
DigCrowd first uses the depth information of an image to segment the scene into a far-view region and a near-view region.
no code implementations • ICLR 2018 • Guanhua Zheng, Jitao Sang, Changsheng Xu
The DNN is then regarded as approximating the feature conditions with multilayer feature learning, and is proved to be a recursive solution toward the maximum entropy principle.
no code implementations • CVPR 2017 • Tianzhu Zhang, Changsheng Xu, Ming-Hsuan Yang
In this paper, we propose a multi-task correlation particle filter (MCPF) for robust visual tracking.
no code implementations • CVPR 2016 • Si Liu, Tianzhu Zhang, Xiaochun Cao, Changsheng Xu
In this paper, we propose a novel structural correlation filter (SCF) model for robust visual tracking.
no code implementations • CVPR 2015 • Tianzhu Zhang, Si Liu, Changsheng Xu, Shuicheng Yan, Bernard Ghanem, Narendra Ahuja, Ming-Hsuan Yang
Sparse representation has been applied to visual tracking by finding the best target candidate with minimal reconstruction error by use of target templates.
no code implementations • CVPR 2015 • Si Liu, Xiaodan Liang, Luoqi Liu, Xiaohui Shen, Jianchao Yang, Changsheng Xu, Liang Lin, Xiaochun Cao, Shuicheng Yan
Under the classic K Nearest Neighbor (KNN)-based nonparametric framework, the parametric Matching Convolutional Neural Network (M-CNN) is proposed to predict the matching confidence and displacements of the best matched region in the testing image for a particular semantic region in one KNN image.
no code implementations • CVPR 2014 • Tianzhu Zhang, Kui Jia, Changsheng Xu, Yi Ma, Narendra Ahuja
The proposed part matching tracker (PMT) has a number of attractive properties.