Search Results for author: Qingming Huang

Found 146 papers, 88 papers with code

Interpretable Visual Reasoning via Probabilistic Formulation under Natural Supervision

no code implementations • ECCV 2020 • Xinzhe Han, Shuhui Wang, Chi Su, Weigang Zhang, Qingming Huang, Qi Tian

In this paper, we rethink the implicit reasoning process in VQA, and propose a new formulation which maximizes the log-likelihood of the joint distribution for the observed question and predicted answer.

Question Answering Visual Question Answering +1


Weakly-Supervised Crowd Counting Learns from Sorting rather than Locations

no code implementations • ECCV 2020 • Yifan Yang, Guorong Li, Zhe Wu, Li Su, Qingming Huang, Nicu Sebe

We propose a soft-label sorting network along with the counting network, which sorts the given images by their crowd numbers.

Crowd Counting


Context-aware Difference Distilling for Multi-change Captioning

no code implementations • 31 May 2024 • Yunbin Tu, Liang Li, Li Su, Zheng-Jun Zha, Chenggang Yan, Qingming Huang

Given an image pair, CARD first decouples context features that aggregate all similar/dissimilar semantics, termed common/difference context features.

Decoder


Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection

1 code implementation • 16 May 2024 • Feiran Li, Qianqian Xu, Shilong Bao, Zhiyong Yang, Runmin Cong, Xiaochun Cao, Qingming Huang

This paper explores the size-invariance of evaluation metrics in Salient Object Detection (SOD), especially when multiple targets of diverse sizes co-exist in the same image.

Object object-detection +2


ReconBoost: Boosting Can Achieve Modality Reconcilement

1 code implementation • 15 May 2024 • Cong Hua, Qianqian Xu, Shilong Bao, Zhiyong Yang, Qingming Huang

This paper explores a novel multi-modal alternating learning paradigm pursuing a reconciliation between the exploitation of uni-modal features and the exploration of cross-modal interactions.


Harnessing Hierarchical Label Distribution Variations in Test Agnostic Long-tail Recognition

1 code implementation • 13 May 2024 • Zhiyong Yang, Qianqian Xu, Zitai Wang, Sicong Li, Boyu Han, Shilong Bao, Xiaochun Cao, Qingming Huang

Traditional methods predominantly use a Mixture-of-Expert (MoE) approach, targeting a few fixed test label distributions that exhibit substantial global variations.

Ranked #1 on Test Agnostic Long-Tailed Learning on iNaturalist 2018

Image Classification Long-tail Learning +1


Retrieval Enhanced Zero-Shot Video Captioning

no code implementations • 11 May 2024 • Yunchuan Ma, Laiyun Qing, Guorong Li, Yuankai Qi, Quan Z. Sheng, Qingming Huang

Despite the significant progress of fully-supervised video captioning, zero-shot methods remain much less explored.

Retrieval Test-time Adaptation +3


Uncertainty-boosted Robust Video Activity Anticipation

1 code implementation • 29 Apr 2024 • Zhaobo Qi, Shuhui Wang, Weigang Zhang, Qingming Huang

Video activity anticipation aims to predict what will happen in the future, with broad application prospects ranging from robot vision to autonomous driving.

Autonomous Driving


A Channel-ensemble Approach: Unbiased and Low-variance Pseudo-labels is Critical for Semi-supervised Classification

no code implementations • 27 Mar 2024 • Jiaqi Wu, Junbiao Pang, Baochang Zhang, Qingming Huang

Semi-supervised learning (SSL) is a practical challenge in computer vision.

Pseudo Label


A Comprehensive Survey of 3D Dense Captioning: Localizing and Describing Objects in 3D Scenes

no code implementations • 12 Mar 2024 • Ting Yu, Xiaojun Lin, Shuhui Wang, Weiguo Sheng, Qingming Huang, Jun Yu

Three-Dimensional (3D) dense captioning is an emerging vision-language bridging task that aims to generate multiple detailed and accurate descriptions for 3D scenes.

3D dense captioning Dense Captioning


Query-guided Prototype Evolution Network for Few-Shot Segmentation

no code implementations • 11 Mar 2024 • Runmin Cong, Hang Xiong, Jinpeng Chen, Wei Zhang, Qingming Huang, Yao Zhao

To address this, we present the Query-guided Prototype Evolution Network (QPENet), a new method that integrates query features into the generation process of foreground and background prototypes, thereby yielding customized prototypes attuned to specific queries.

Segmentation


StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing

no code implementations • 20 Feb 2024 • Gaoxiang Cong, Yuankai Qi, Liang Li, Amin Beheshti, Zhedong Zhang, Anton Van Den Hengel, Ming-Hsuan Yang, Chenggang Yan, Qingming Huang

It contains three main components: (1) a multimodal style adaptor operating at the phoneme level to learn pronunciation style from the reference audio and generate intermediate representations informed by the facial emotion presented in the video; (2) an utterance-level style learning module, which guides both the mel-spectrogram decoding and the refining processes from the intermediate embeddings to improve the overall style expression; and (3) a phoneme-guided lip aligner to maintain lip sync.

Voice Cloning


Pick-and-Draw: Training-free Semantic Guidance for Text-to-Image Personalization

no code implementations • 30 Jan 2024 • Henglei Lv, Jiayu Xiao, Liang Li, Qingming Huang

To this end, we propose Pick-and-Draw, a training-free semantic guidance approach to boost identity consistency and generative diversity for personalization methods.


Bias-Conflict Sample Synthesis and Adversarial Removal Debias Strategy for Temporal Sentence Grounding in Video

1 code implementation • 15 Jan 2024 • Zhaobo Qi, Yibo Yuan, Xiaowen Ruan, Shuhui Wang, Weigang Zhang, Qingming Huang

Temporal Sentence Grounding in Video (TSGV) is troubled by the dataset bias issue, which is caused by the uneven temporal distribution of the target moments for samples with similar semantic components in input videos or query texts.

Sentence Temporal Sentence Grounding


ADA-GAD: Anomaly-Denoised Autoencoders for Graph Anomaly Detection

1 code implementation • 22 Dec 2023 • Junwei He, Qianqian Xu, Yangbangyan Jiang, Zitai Wang, Qingming Huang

We pretrain graph autoencoders on these augmented graphs at multiple levels, which enables the graph autoencoders to capture normal patterns.

Fraud Detection Graph Anomaly Detection


Subject-Oriented Video Captioning

no code implementations • 20 Dec 2023 • Yunchuan Ma, Chang Teng, Yuankai Qi, Guorong Li, Laiyun Qing, Qi Wu, Qingming Huang

To address this problem, we propose a new video captioning task, subject-oriented video captioning, which allows users to specify the describing target via a bounding box.

Video Captioning


Weakly Supervised Video Individual Counting

1 code implementation • 10 Dec 2023 • Xinyan Liu, Guorong Li, Yuankai Qi, Ziheng Yan, Zhenjun Han, Anton Van Den Hengel, Ming-Hsuan Yang, Qingming Huang

To provide a more realistic reflection of the underlying practical challenge, we introduce a weakly supervised VIC task, wherein trajectory labels are not provided.

Contrastive Learning Video Individual Counting


Dynamic Erasing Network Based on Multi-Scale Temporal Features for Weakly Supervised Video Anomaly Detection

1 code implementation • 4 Dec 2023 • Chen Zhang, Guorong Li, Yuankai Qi, Hanhua Ye, Laiyun Qing, Ming-Hsuan Yang, Qingming Huang

To address these limitations, we propose a Dynamic Erasing Network (DE-Net) for weakly supervised video anomaly detection, which learns multi-scale temporal features.

Anomaly Detection Video Anomaly Detection


DRAUC: An Instance-wise Distributionally Robust AUC Optimization Framework

1 code implementation • NeurIPS 2023 • Siran Dai, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang

To tackle this challenge, methodically we propose an instance-wise surrogate loss of Distributionally Robust AUC (DRAUC) and build our optimization framework on top of it.


Modeling the Uncertainty with Maximum Discrepant Students for Semi-supervised 2D Pose Estimation

no code implementations • 3 Nov 2023 • Jiaqi Wu, Junbiao Pang, Qingming Huang

Semi-supervised pose estimation is a practically challenging task for computer vision.

2D Pose Estimation Pose Estimation


Generating Unbiased Pseudo-labels via a Theoretically Guaranteed Chebyshev Constraint to Unify Semi-supervised Classification and Regression

1 code implementation • 3 Nov 2023 • Jiaqi Wu, Junbiao Pang, Qingming Huang

Both semi-supervised classification and regression are practically challenging tasks for computer vision.

Classification Pose Estimation +2


R&B: Region and Boundary Aware Zero-shot Grounded Text-to-image Generation

no code implementations • 13 Oct 2023 • Jiayu Xiao, Henglei Lv, Liang Li, Shuhui Wang, Qingming Huang

Recent text-to-image (T2I) diffusion models have achieved remarkable progress in generating high-quality images given text-prompts as input.

Text-to-Image Generation


Towards Demystifying the Generalization Behaviors When Neural Collapse Emerges

no code implementations • 12 Oct 2023 • Peifeng Gao, Qianqian Xu, Yibo Yang, Peisong Wen, Huiyang Shao, Zhiyong Yang, Bernard Ghanem, Qingming Huang

While there have been extensive studies on optimization characteristics showing the global optimality of neural collapse, little research has been done on the generalization behaviors during the occurrence of NC.


Open-Set Knowledge-Based Visual Question Answering with Inference Paths

1 code implementation • 12 Oct 2023 • Jingru Gan, Xinzhe Han, Shuhui Wang, Qingming Huang

Given an image and an associated textual question, the purpose of Knowledge-Based Visual Question Answering (KB-VQA) is to provide a correct answer to the question with the aid of external knowledge bases.

Knowledge Graphs Multi-class Classification +2


A Unified Generalization Analysis of Re-Weighting and Logit-Adjustment for Imbalanced Learning

1 code implementation • 7 Oct 2023 • Zitai Wang, Qianqian Xu, Zhiyong Yang, Yuan He, Xiaochun Cao, Qingming Huang

However, existing generalization analysis of such losses is still coarse-grained and fragmented, failing to explain some empirical results.

Ranked #6 on Long-tail Learning on CIFAR-10-LT (ρ=10)

Long-tail Learning


A Unified Generalization Analysis of Re-Weighting and Logit-Adjustment for Imbalanced Learning

1 code implementation • NeurIPS 2023 • Zitai Wang, Qianqian Xu, Zhiyong Yang, Yuan He, Xiaochun Cao, Qingming Huang

However, existing generalization analysis of such losses is still coarse-grained and fragmented, failing to explain some empirical results.


Learning node representation via Motif Coarsening

1 code implementation • journal 2023 • Junyu Chen, Qianqian Xu, Zhiyong Yang, Ke Ma, Xiaochun Cao, Qingming Huang

For the motif-based node representation learning process, we propose a Motif Coarsening strategy for incorporating motif structure into the graph representation learning process.

Graph Representation Learning


Self-supervised Cross-view Representation Reconstruction for Change Captioning

1 code implementation • ICCV 2023 • Yunbin Tu, Liang Li, Li Su, Zheng-Jun Zha, Chenggang Yan, Qingming Huang

Change captioning aims to describe the difference between a pair of similar images.

Caption Generation Hallucination


AUC-Oriented Domain Adaptation: From Theory to Algorithm

1 code implementation • TPAMI 2023 • Zhiyong Yang, Qianqian Xu, Shilong Bao, Peisong Wen, Xiaochun Cao, Qingming Huang

We propose a new result that not only addresses the interdependency issue but also brings a much sharper bound with weaker assumptions about the loss function.

Disease Prediction Fraud Detection +1


Revisiting AUC-oriented Adversarial Training with Loss-Agnostic Perturbations

2 code implementations • TPAMI 2023 • Zhiyong Yang, Qianqian Xu, Wenzheng Hou, Shilong Bao, Yuan He, Xiaochun Cao, Qingming Huang

On top of this, we can show that: 1) Under mild conditions, AdAUC can be optimized equivalently with score-based or instance-wise-loss-based perturbations, which is compatible with most of the popular adversarial example generation methods.


When Measures are Unreliable: Imperceptible Adversarial Perturbations toward Top-$k$ Multi-Label Learning

1 code implementation • 27 Jul 2023 • Yuchen Sun, Qianqian Xu, Zitai Wang, Qingming Huang

However, existing adversarial attacks toward multi-label learning only pursue the traditional visual imperceptibility but ignore the new perceptible problem coming from measures such as Precision@$k$ and mAP@$k$.

Adversarial Attack Multi-Label Learning


PUGAN: Physical Model-Guided Underwater Image Enhancement Using GAN with Dual-Discriminators

1 code implementation • 15 Jun 2023 • Runmin Cong, Wenyu Yang, Wei Zhang, Chongyi Li, Chun-Le Guo, Qingming Huang, Sam Kwong

Among existing UIE methods, Generative Adversarial Networks (GANs) based methods perform well in visual aesthetics, while the physical model-based methods have better scene adaptability.

Quantization UIE


Multi-task Paired Masking with Alignment Modeling for Medical Vision-Language Pre-training

no code implementations • 13 May 2023 • Ke Zhang, Yan Yang, Jun Yu, Hanliang Jiang, Jianping Fan, Qingming Huang, Weidong Han

To address this limitation, we propose a unified Med-VLP framework based on Multi-task Paired Masking with Alignment (MPMA) to integrate the cross-modal alignment task into the joint image-text reconstruction framework to achieve more comprehensive cross-modal interaction, while a Global and Local Alignment (GLA) module is designed to assist self-supervised paradigm in obtaining semantic representations with rich domain knowledge.


A Study of Neural Collapse Phenomenon: Grassmannian Frame, Symmetry and Generalization

no code implementations • 18 Apr 2023 • Peifeng Gao, Qianqian Xu, Peisong Wen, Huiyang Shao, Zhiyong Yang, Qingming Huang

Out of curiosity about the symmetry of Grassmannian Frame, we conduct experiments to explore if models with different Grassmannian Frames have different performance.


Neighborhood Contrastive Transformer for Change Captioning

1 code implementation • 6 Mar 2023 • Yunbin Tu, Liang Li, Li Su, Ke Lu, Qingming Huang

Change captioning is to describe the semantic change between a pair of similar images in natural language.

Decoder Image Captioning


Stable Attribute Group Editing for Reliable Few-shot Image Generation

1 code implementation • 1 Feb 2023 • Guanqi Ding, Xinzhe Han, Shuhui Wang, Xin Jin, Dandan Tu, Qingming Huang

SAGE makes use of all given few-shot images and estimates a class center embedding based on the category-relevant attribute dictionary.

Attribute Classification +1


Building Bridge Across the Time: Disruption and Restoration of Murals In the Wild

no code implementations • ICCV 2023 • Huiyang Shao, Qianqian Xu, Peisong Wen, Peifeng Gao, Zhiyong Yang, Qingming Huang

Finally, experimental results support the effectiveness of the proposed framework in terms of both mural synthesis and restoration.

Image Restoration


Text-Driven Generative Domain Adaptation with Spectral Consistency Regularization

1 code implementation • ICCV 2023 • Zhenhuan Liu, Liang Li, Jiayu Xiao, Zheng-Jun Zha, Qingming Huang

The experiments demonstrate the effectiveness of our method to preserve the diversity of source domain and generate high fidelity target images.

Domain Adaptation


Multi-Projection Fusion and Refinement Network for Salient Object Detection in 360° Omnidirectional Image

no code implementations • 23 Dec 2022 • Runmin Cong, Ke Huang, Jianjun Lei, Yao Zhao, Qingming Huang, Sam Kwong

Salient object detection (SOD) aims to determine the most visually attractive objects in an image.

Decoder object-detection +2


Learning to Dub Movies via Hierarchical Prosody Models

1 code implementation • CVPR 2023 • Gaoxiang Cong, Liang Li, Yuankai Qi, ZhengJun Zha, Qi Wu, Wenyu Wang, Bin Jiang, Ming-Hsuan Yang, Qingming Huang

Given a piece of text, a video clip and a reference audio, the movie dubbing (also known as visual voice clone V2C) task aims to generate speeches that match the speaker's emotion presented in the video using the desired speaker voice as reference.


Consistency-Aware Anchor Pyramid Network for Crowd Localization

no code implementations • 8 Dec 2022 • Xinyan Liu, Guorong Li, Yuankai Qi, Zhenjun Han, Qingming Huang, Ming-Hsuan Yang, Nicu Sebe

Crowd localization aims to predict the spatial position of humans in a crowd scenario.

Position


Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection

no code implementations • CVPR 2023 • Chen Zhang, Guorong Li, Yuankai Qi, Shuhui Wang, Laiyun Qing, Qingming Huang, Ming-Hsuan Yang

Weakly supervised video anomaly detection aims to identify abnormal events in videos using only video-level labels.

Anomaly Detection Pseudo Label +1


Progressive Multi-resolution Loss for Crowd Counting

1 code implementation • 8 Dec 2022 • Ziheng Yan, Yuankai Qi, Guorong Li, Xinyan Liu, Weigang Zhang, Qingming Huang, Ming-Hsuan Yang

Crowd counting is usually handled in a density map regression fashion, which is supervised via an L2 loss between the predicted density map and ground truth.

Crowd Counting


Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective

1 code implementation • CVPR 2022 • Yunrui Zhao, Qianqian Xu, Yangbangyan Jiang, Peisong Wen, Qingming Huang

Positive-Unlabeled (PU) learning tries to learn binary classifiers from a few labeled positive examples with many unlabeled ones.


OpenAUC: Towards AUC-Oriented Open-Set Recognition

1 code implementation • 22 Oct 2022 • Zitai Wang, Qianqian Xu, Zhiyong Yang, Yuan He, Xiaochun Cao, Qingming Huang

In this paper, a systematic analysis reveals that most existing metrics are essentially inconsistent with the aforementioned goal of OSR: (1) For metrics extended from close-set classification, such as Open-set F-score, Youden's index, and Normalized Accuracy, a poor open-set prediction can escape from a low performance score with a superior close-set prediction.

Novelty Detection Open Set Learning


Towards Understanding and Boosting Adversarial Transferability from a Distribution Perspective

2 code implementations • 9 Oct 2022 • Yao Zhu, Yuefeng Chen, Xiaodan Li, Kejiang Chen, Yuan He, Xiang Tian, Bolun Zheng, Yaowu Chen, Qingming Huang

We conduct comprehensive transferable attacks against multiple DNNs to demonstrate the effectiveness of the proposed method.


Does Thermal Really Always Matter for RGB-T Salient Object Detection?

2 code implementations • 9 Oct 2022 • Runmin Cong, Kepu Zhang, Chen Zhang, Feng Zheng, Yao Zhao, Qingming Huang, Sam Kwong

In addition, considering the role of thermal modality, we set up different cross-modality interaction mechanisms in the encoding phase and the decoding phase.

object-detection Object Detection +2


Asymptotically Unbiased Instance-wise Regularized Partial AUC Optimization: Theory and Algorithm

2 code implementations • NeurIPS 2022 • Huiyang Shao, Qianqian Xu, Zhiyong Yang, Shilong Bao, Qingming Huang

sample size and a slow convergence rate, especially for TPAUC.


CIR-Net: Cross-modality Interaction and Refinement for RGB-D Salient Object Detection

3 code implementations • 6 Oct 2022 • Runmin Cong, Qinwei Lin, Chen Zhang, Chongyi Li, Xiaochun Cao, Qingming Huang, Yao Zhao

Focusing on the issue of how to effectively capture and utilize cross-modality information in RGB-D salient object detection (SOD) task, we present a convolutional neural network (CNN) model, named CIR-Net, based on the novel cross-modality interaction and refinement.

Decoder object-detection +2


A Unified Framework against Topology and Class Imbalance

2 code implementations • Proceedings of the 30th ACM International Conference on Multimedia 2022 • Junyu Chen, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang

We develop a multi-class AUC optimization work to deal with the class imbalance problem.

Graph Learning


Recurrent Meta-Learning against Generalized Cold-start Problem in CTR Prediction

1 code implementation • Conference 2022 • Junyu Chen, Qianqian Xu, Zhiyong Yang, Ke Ma, Xiaochun Cao, Qingming Huang

To attack this problem, we propose a recursive meta-learning model with the user's behavior sequence prediction as a separate training task.

Click-Through Rate Prediction Meta-Learning


The Minority Matters: A Diversity-Promoting Collaborative Metric Learning Algorithm

1 code implementation • NeurIPS 2023 • Shilong Bao, Qianqian Xu, Zhiyong Yang, Yuan He, Xiaochun Cao, Qingming Huang

Collaborative Metric Learning (CML) has recently emerged as a popular method in recommendation systems (RS), closing the gap between metric learning and Collaborative Filtering.

Collaborative Filtering Metric Learning +1


Exploring the Algorithm-Dependent Generalization of AUPRC Optimization with List Stability

1 code implementation • 27 Sep 2022 • Peisong Wen, Qianqian Xu, Zhiyong Yang, Yuan He, Qingming Huang

Stochastic optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning.

Generalization Bounds Image Retrieval +2


MaxMatch: Semi-Supervised Learning with Worst-Case Consistency

no code implementations • 26 Sep 2022 • Yangbangyan Jiang, Xiaodan Li, Yuefeng Chen, Yuan He, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang

In recent years, great progress has been made to incorporate unlabeled data to overcome the inefficiently supervised problem via semi-supervised learning (SSL).


A Tale of HodgeRank and Spectral Method: Target Attack Against Rank Aggregation Is the Fixed Point of Adversarial Game

1 code implementation • 13 Sep 2022 • Ke Ma, Qianqian Xu, Jinshan Zeng, Guorong Li, Xiaochun Cao, Qingming Huang

From the perspective of the dynamical system, the attack behavior with a target ranking list is a fixed point belonging to the composition of the adversary and the victim.

Information Retrieval Retrieval


Optimizing Partial Area Under the Top-k Curve: Theory and Practice

1 code implementation • 3 Sep 2022 • Zitai Wang, Qianqian Xu, Zhiyong Yang, Yuan He, Xiaochun Cao, Qingming Huang

Finally, the experimental results on four benchmark datasets validate the effectiveness of our proposed framework.


Multi-Attention Network for Compressed Video Referring Object Segmentation

1 code implementation • 26 Jul 2022 • Weidong Chen, Dexiang Hong, Yuankai Qi, Zhenjun Han, Shuhui Wang, Laiyun Qing, Qingming Huang, Guorong Li

To address this problem, we propose a multi-attention network which consists of a dual-path dual-attention module and a query-based cross-modal Transformer module.

Ranked #5 on Referring Expression Segmentation on A2D Sentences

Object Referring Expression Segmentation +4


Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding

1 code implementation • 18 Jul 2022 • Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Zechao Li, Qi Tian, Qingming Huang

Second, most previous weakly supervised REG methods ignore the discriminative location and context of the referent, causing difficulties in distinguishing the target from other same-category objects.

Attribute Referring Expression +2


Meta-Wrapper: Differentiable Wrapping Operator for User Interest Selection in CTR Prediction

no code implementations • 28 Jun 2022 • Tianwei Cao, Qianqian Xu, Zhiyong Yang, Qingming Huang

In this paper, we regard user interest modeling as a feature selection problem, which we call user interest selection.

Bilevel Optimization Click-Through Rate Prediction +3


ER: Equivariance Regularizer for Knowledge Graph Completion

1 code implementation • 24 Jun 2022 • Zongsheng Cao, Qianqian Xu, Zhiyong Yang, Qingming Huang

To address this issue, we propose a new regularizer, namely, Equivariance Regularizer (ER), which can suppress overfitting by leveraging the implicit semantic information.

Knowledge Graph Completion


AdAUC: End-to-end Adversarial AUC Optimization Against Long-tail Problems

no code implementations • ICML 2022 • Wenzheng Hou, Qianqian Xu, Zhiyong Yang, Shilong Bao, Yuan He, Qingming Huang

Our analysis differs from the existing studies since the algorithm is asked to generate adversarial examples by calculating the gradient of a min-max problem.


Geometry Interaction Knowledge Graph Embeddings

1 code implementation • 24 Jun 2022 • Zongsheng Cao, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang

Knowledge graph (KG) embeddings have shown great power in learning representations of entities and relations for link prediction tasks.

Knowledge Graph Completion Knowledge Graph Embeddings +1


Optimizing Two-way Partial AUC with an End-to-end Framework

1 code implementation • TPAMI 2022 • Zhiyong Yang, Qianqian Xu, Shilong Bao, Yuan He, Xiaochun Cao, Qingming Huang

The critical challenge along this course lies in the difficulty of performing gradient-based optimization with end-to-end stochastic training, even with a proper choice of surrogate loss.

Vocal Bursts Valence Prediction


Rethinking Collaborative Metric Learning: Toward an Efficient Alternative without Negative Sampling

no code implementations • TPAMI 2022 • Shilong Bao, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang

However, in this work, by taking a theoretical analysis, we find that negative sampling would lead to a biased estimation of the generalization error.

Metric Learning Recommendation Systems


Automatic Relation-aware Graph Network Proliferation

1 code implementation • CVPR 2022 • Shaofei Cai, Liang Li, Xinzhe Han, Jiebo Luo, Zheng-Jun Zha, Qingming Huang

However, the currently used graph search space overemphasizes learning node features and neglects mining hierarchical relational information.

Ranked #2 on Link Prediction on TSP/HCP Benchmark set

Graph Classification Graph Learning +5


Global-and-Local Collaborative Learning for Co-Salient Object Detection

2 code implementations • 19 Apr 2022 • Runmin Cong, Ning Yang, Chongyi Li, Huazhu Fu, Yao Zhao, Qingming Huang, Sam Kwong

In this paper, we propose a global-and-local collaborative learning architecture, which includes a global correspondence modeling (GCM) and a local correspondence modeling (LCM) to capture comprehensive inter-image corresponding relationship among different images from the global and local perspectives.

8k Co-Salient Object Detection +2


CenterNet++ for Object Detection

2 code implementations • 18 Apr 2022 • Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian

Our approach, named CenterNet, detects each object as a triplet of keypoints (top-left and bottom-right corners and the center keypoint).

Ranked #35 on Object Detection on COCO test-dev

Object object-detection +1


IR-GAN: Image Manipulation with Linguistic Instruction by Increment Reasoning

no code implementations • 2 Apr 2022 • Zhenhuan Liu, Jincan Deng, Liang Li, Shaofei Cai, Qianqian Xu, Shuhui Wang, Qingming Huang

Conditional image generation is an active research topic including text2image and image translation.

Conditional Image Generation Generative Adversarial Network +1


Attribute Group Editing for Reliable Few-shot Image Generation

1 code implementation • CVPR 2022 • Guanqi Ding, Xinzhe Han, Shuhui Wang, Shuzhe Wu, Xin Jin, Dandan Tu, Qingming Huang

Few-shot image generation is a challenging task even using the state-of-the-art Generative Adversarial Networks (GANs).

Attribute Dictionary Learning +1


Few Shot Generative Model Adaption via Relaxed Spatial Structural Alignment

2 code implementations • CVPR 2022 • Jiayu Xiao, Liang Li, Chaofei Wang, Zheng-Jun Zha, Qingming Huang

A feasible solution is to start with a GAN well-trained on a large scale source domain and adapt it to the target domain with a few samples, termed as few shot generative model adaption.

Generative Adversarial Network


General Greedy De-bias Learning

1 code implementation • 20 Dec 2021 • Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian

Existing de-bias learning frameworks try to capture specific dataset bias by annotations but they fail to handle complicated OOD scenarios.

Image Classification Question Answering +1


When False Positive is Intolerant: End-to-End Optimization with Low FPR for Multipartite Ranking

no code implementations • NeurIPS 2021 • Peisong Wen, Qianqian Xu, Zhiyong Yang, Yuan He, Qingming Huang

To leverage high performance under low FPRs, we consider an alternative metric for multipartite ranking evaluating the True Positive Rate (TPR) at a given FPR, denoted as TPR@FPR.


Hierarchical Modular Network for Video Captioning

1 code implementation • CVPR 2022 • Hanhua Ye, Guorong Li, Yuankai Qi, Shuhui Wang, Qingming Huang, Ming-Hsuan Yang

(II) Predicate level, which learns the actions conditioned on highlighted objects and is supervised by the predicate in captions.

Representation Learning Sentence +1


Self-Regulated Learning for Egocentric Video Activity Anticipation

1 code implementation • 23 Nov 2021 • Zhaobo Qi, Shuhui Wang, Chi Su, Li Su, Qingming Huang, Qi Tian

Future activity anticipation is a challenging problem in egocentric vision.

Multi-Task Learning


Modeling Temporal Concept Receptive Field Dynamically for Untrimmed Video Analysis

1 code implementation • 23 Nov 2021 • Zhaobo Qi, Shuhui Wang, Chi Su, Li Su, Weigang Zhang, Qingming Huang

Based on TDC, we propose the temporal dynamic concept modeling network (TDCMN) to learn an accurate and complete concept representation for efficient untrimmed video analysis.

Image Categorization


DVCFlow: Modeling Information Flow Towards Human-like Video Captioning

no code implementations • 19 Nov 2021 • Xu Yan, Zhengcong Fei, Shuhui Wang, Qingming Huang, Qi Tian

Dense video captioning (DVC) aims to generate multi-sentence descriptions to elucidate the multiple events in the video, which is challenging and demands visual consistency, discoursal coherence, and linguistic diversity.

Dense Video Captioning Sentence


Implicit Feedbacks are Not Always Favorable: Iterative Relabeled One-Class Collaborative Filtering against Noisy Interactions

1 code implementation • ACM MM 2021 • Zitai Wang, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang

As the core of the framework, the iterative relabeling module exploits the self-training principle to dynamically generate pseudo labels for user preferences.

Collaborative Filtering Recommendation Systems


Learning Meta-path-aware Embeddings for Recommender Systems

1 code implementation • ACM MM 2021 • Qianxiu Hao, Qianqian Xu, Zhiyong Yang, Qingming Huang

Heterogeneous information networks (HINs) have become a popular tool to capture complicated user-item relationships in recommendation problems in recent years.

Recommendation Systems


Semi-Autoregressive Image Captioning

1 code implementation • 11 Oct 2021 • Xu Yan, Zhengcong Fei, Zekang Li, Shuhui Wang, Qingming Huang, Qi Tian

Non-autoregressive image captioning with continuous iterative refinement, which eliminates the sequential dependence in a sentence generation, can achieve comparable performance to the autoregressive counterparts with a considerable acceleration.

Decoder Image Captioning +1

Paper
Code

Pareto Optimality for Fairness-constrained Collaborative Filtering

2 code implementations • MM '21: Proceedings of the 29th ACM International Conference on Multimedia 2021 • Qianxiu Hao, Qianqian Xu, Zhiyong Yang, Qingming Huang

To balance overall recommendation performance and fairness, prevalent solutions apply fairness constraints or regularizations to enforce equality of certain performance across different subgroups.

Collaborative Filtering Fairness

Paper
Code

Edge-featured Graph Neural Architecture Search

no code implementations • 3 Sep 2021 • Shaofei Cai, Liang Li, Xinzhe Han, Zheng-Jun Zha, Qingming Huang

Recently, researchers have studied neural architecture search (NAS) to reduce the dependence on human expertise and explore better GNN architectures, but they over-emphasize entity features and ignore latent relation information concealed in the edges.

Neural Architecture Search

Paper
Add Code

Learning with Multiclass AUC: Theory and Algorithms

no code implementations • TPAMI 2021 • Zhiyong Yang, Qianqian Xu, Shilong Bao, Xiaochun Cao, Qingming Huang

Our foundation is based on the M metric, which is a well-known multiclass extension of AUC.

Recommendation Systems

Paper
Add Code

Greedy Gradient Ensemble for Robust Visual Question Answering

1 code implementation • ICCV 2021 • Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian

Language bias is a critical issue in Visual Question Answering (VQA), where models often exploit dataset biases for the final decision without considering the image information.

Ranked #2 on Visual Question Answering (VQA) on VQA-CP

Question Answering Visual Question Answering

Paper
Code

When All We Need is a Piece of the Pie: A Generic Framework for Optimizing Two-way Partial AUC.

1 code implementation • ICML 2021 • Zhiyong Yang, Qianqian Xu, Shilong Bao, Yuan He, Xiaochun Cao, Qingming Huang

The critical challenge along this course lies in the difficulty of performing gradient-based optimization with end-to-end stochastic training, even with a proper choice of surrogate loss.

Paper
Code

Fast Batch Nuclear-norm Maximization and Minimization for Robust Domain Adaptation

1 code implementation • 13 Jul 2021 • Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian

Due to the domain discrepancy in visual domain adaptation, the performance of source model degrades when bumping into the high data density near decision boundary in target domain.

Domain Adaptation

Paper
Code

Poisoning Attack against Estimating from Pairwise Comparisons

1 code implementation • 5 Jul 2021 • Ke Ma, Qianqian Xu, Jinshan Zeng, Xiaochun Cao, Qingming Huang

In this paper, to the best of our knowledge, we initiate the first systematic investigation of data poisoning attacks on pairwise ranking algorithms, which can be formalized as the dynamic and static games between the ranker and the attacker and can be modeled as certain kinds of integer programming problems.

Data Poisoning

Paper
Code

When False Positive is Intolerant: End-to-End Optimization with Low FPR for Multipartite Ranking

no code implementations • NeurIPS 2021 • Peisong Wen, Qianqian Xu, Zhiyong Yang, Yuan He, Qingming Huang

To leverage high performance under low FPRs, we consider an alternative metric for multipartite ranking evaluating the True Positive Rate (TPR) at a given FPR, denoted as TPR@FPR.

Paper
Add Code

Location-Sensitive Visual Recognition with Cross-IOU Loss

1 code implementation • 11 Apr 2021 • Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian

Object detection, instance segmentation, and pose estimation are popular visual recognition tasks which require localizing the object by internal or boundary landmarks.

Ranked #56 on Object Detection on COCO test-dev

2D Human Pose Estimation Instance Segmentation +5

Paper
Code

Rethinking Graph Neural Architecture Search from Message-passing

1 code implementation • CVPR 2021 • Shaofei Cai, Liang Li, Jincan Deng, Beichen Zhang, Zheng-Jun Zha, Li Su, Qingming Huang

Inspired by the strong searching capability of neural architecture search (NAS) in CNN, this paper proposes Graph Neural Architecture Search (GNAS) with novel-designed search space.

feature selection Neural Architecture Search

Paper
Code

Seeking the Shape of Sound: An Adaptive Framework for Learning Voice-Face Association

1 code implementation • CVPR 2021 • Peisong Wen, Qianqian Xu, Yangbangyan Jiang, Zhiyong Yang, Yuan He, Qingming Huang

Targeting at (a), we propose a two-level modality alignment loss where both global and local information are considered.

Retrieval

Paper
Code

Viewpoint and Scale Consistency Reinforcement for UAV Vehicle Re-Identification

1 code implementation • IJCV 2021 • Shangzhi Teng, Shiliang Zhang, Qingming Huang, Nicu Sebe

Moreover, our method also achieves competitive performance compared with recent works on existing vehicle ReID datasets including VehicleID, VeRi-776 and VERI-Wild.

Vehicle Re-Identification

Paper
Code

Exploiting Sample Correlation for Crowd Counting With Multi-Expert Network

no code implementations • ICCV 2021 • Xinyan Liu, Guorong Li, Zhenjun Han, Weigang Zhang, Yifan Yang, Qingming Huang, Nicu Sebe

Specifically, we propose a task-driven similarity metric based on samples' mutual enhancement, referred to as co-fine-tune similarity, which can find a more efficient subset of data for training the expert network.

Crowd Counting

Paper
Add Code

Heuristic Domain Adaptation

1 code implementation • NeurIPS 2020 • Shuhao Cui, Xuan Jin, Shuhui Wang, Yuan He, Qingming Huang

In visual domain adaptation (DA), separating the domain-specific characteristics from the domain-invariant representations is an ill-posed problem.

Domain Adaptation

Paper
Code

Semantic Editing On Segmentation Map Via Multi-Expansion Loss

no code implementations • 16 Oct 2020 • Jianfeng He, Xuchao Zhang, Shuo Lei, Shuhui Wang, Qingming Huang, Chang-Tien Lu, Bei Xiao

Each MEx area has the mask area of the generation as the majority and the boundary of original context as the minority.

Image Inpainting Segmentation

Paper
Add Code

Label Decoupling Framework for Salient Object Detection

1 code implementation • CVPR 2020 • Jun Wei, Shuhui Wang, Zhe Wu, Chi Su, Qingming Huang, Qi Tian

Though remarkable progress has been achieved, we observe that the closer a pixel is to the edge, the more difficult it is to predict, because edge pixels have a very imbalanced distribution.

Ranked #1 on Saliency Detection on HKU-IS

Object object-detection +3

Paper
Code

Corner Proposal Network for Anchor-free, Two-stage Object Detection

1 code implementation • ECCV 2020 • Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian

On the MS-COCO dataset, CPN achieves an AP of 49.2%, which is competitive among state-of-the-art object detection methods.

Ranked #94 on Object Detection on COCO test-dev

Computational Efficiency Object +3

Paper
Code

Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction

no code implementations • 29 Apr 2020 • Zhiyong Yang, Qianqian Xu, Xiaochun Cao, Qingming Huang

To this end, we propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).

Attribute Multi-Task Learning

Paper
Add Code

Parsing-based View-aware Embedding Network for Vehicle Re-Identification

1 code implementation • CVPR 2020 • Dechao Meng, Liang Li, Xuejing Liu, Yadong Li, Shijie Yang, Zheng-Jun Zha, Xingyu Gao, Shuhui Wang, Qingming Huang

Vehicle Re-Identification is to find images of the same vehicle from various views in the cross-camera scenario.

Vehicle Re-Identification

Paper
Code

State-Relabeling Adversarial Active Learning

1 code implementation • CVPR 2020 • Beichen Zhang, Liang Li, Shijie Yang, Shuhui Wang, Zheng-Jun Zha, Qingming Huang

In this paper, we propose a state relabeling adversarial active learning model (SRAAL), that leverages both the annotation and the labeled/unlabeled state information for deriving the most informative unlabeled samples.

Active Learning

Paper
Code

Gradually Vanishing Bridge for Adversarial Domain Adaptation

2 code implementations • CVPR 2020 • Shuhao Cui, Shuhui Wang, Junbao Zhuo, Chi Su, Qingming Huang, Qi Tian

On the discriminator, GVB helps enhance the discriminating ability and balance the adversarial training process.

Unsupervised Domain Adaptation

Paper
Code

Towards Discriminability and Diversity: Batch Nuclear-norm Maximization under Label Insufficient Situations

2 code implementations • CVPR 2020 • Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian

We find by theoretical analysis that the prediction discriminability and diversity could be separately measured by the Frobenius-norm and rank of the batch output matrix.

Domain Adaptation

Paper
Code

DPANet: Depth Potentiality-Aware Gated Attention Network for RGB-D Salient Object Detection

1 code implementation • 19 Mar 2020 • Zuyao Chen, Runmin Cong, Qianqian Xu, Qingming Huang

There are two main issues in RGB-D salient object detection: (1) how to effectively integrate the complementarity from the cross-modal RGB-D data; (2) how to prevent the contamination effect from the unreliable depth map.

Ranked #22 on Thermal Image Segmentation on RGB-T-Glass-Segmentation

object-detection RGB-D Salient Object Detection +3

Paper
Code

Global Context-Aware Progressive Aggregation Network for Salient Object Detection

2 code implementations • 2 Mar 2020 • Zuyao Chen, Qianqian Xu, Runmin Cong, Qingming Huang

Deep convolutional neural networks have achieved competitive performance in salient object detection, in which how to learn effective and comprehensive features plays a critical role.

Ranked #20 on Dichotomous Image Segmentation on DIS-TE1

Dichotomous Image Segmentation object-detection +1

Paper
Code

Generalized Block-Diagonal Structure Pursuit: Learning Soft Latent Task Assignment against Negative Transfer

1 code implementation • NeurIPS 2019 • Zhiyong Yang, Qianqian Xu, Yangbangyan Jiang, Xiaochun Cao, Qingming Huang

Different from most of the previous work, pursuing the Block-Diagonal structure of LTAM (assigning latent tasks to output tasks) alleviates negative transfer via collaboratively grouping latent tasks and output tasks such that inter-group knowledge transfer and sharing is suppressed.

Attribute Multi-Task Learning

Paper
Code

DM2C: Deep Mixed-Modal Clustering

1 code implementation • NeurIPS 2019 • Yangbangyan Jiang, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang

Instead of transforming all the samples into a joint modality-independent space, our framework learns the mappings across individual modal spaces by virtue of cycle-consistency.

Clustering

Paper
Code

F3Net: Fusion, Feedback and Focus for Salient Object Detection

4 code implementations • 26 Nov 2019 • Jun Wei, Shuhui Wang, Qingming Huang

Furthermore, different from binary cross entropy, the proposed PPA loss doesn't treat pixels equally, which can synthesize the local structure information of a pixel to guide the network to focus more on local details.

Ranked #5 on Salient Object Detection on DUT-OMRON

Dichotomous Image Segmentation Object +2

Paper
Code

Collaborative Preference Embedding against Sparse Labels

1 code implementation • ACM MM 2019 • Shilong Bao, Qianqian Xu, Ke Ma, Zhiyong Yang, Xiaochun Cao, Qingming Huang

From the margin theory point-of-view, we then propose a generalization enhancement scheme for sparse and insufficient labels via optimizing the margin distribution.

Collaborative Filtering Decision Making +3

Paper
Code

iSplit LBI: Individualized Partial Ranking with Ties via Split LBI

1 code implementation • NeurIPS 2019 • Qianqian Xu, Xinwei Sun, Zhiyong Yang, Xiaochun Cao, Qingming Huang, Yuan YAO

In this paper, instead of learning a global ranking that agrees with the consensus, we pursue the tie-aware partial ranking from an individualized perspective.

Paper
Code

Learning fragment self-attention embeddings for image-text matching

1 code implementation • ACMMM 2019 • Yiling Wu, Shuhui Wang, Guoli Song, Qingming Huang

In this paper, we propose Self-Attention Embeddings (SAEM) to exploit fragment relations in images or texts by self-attention mechanism, and aggregate fragment information into visual and textual embeddings.

Image-text matching Sentence +1

Paper
Code

Knowledge-guided Pairwise Reconstruction Network for Weakly Supervised Referring Expression Grounding

1 code implementation • 5 Sep 2019 • Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Li Su, Qingming Huang

Weakly supervised referring expression grounding (REG) aims at localizing the referential entity in an image according to linguistic query, where the mapping between the image region (proposal) and the query is unknown in the training stage.

Object Referring Expression +2

Paper
Code

Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding

1 code implementation • ICCV 2019 • Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Dechao Meng, Qingming Huang

It builds the correspondence between image region proposal and query in an adaptive manner: adaptive grounding and collaborative reconstruction.

Attribute Referring Expression +1

Paper
Code

Harmonized Multimodal Learning with Gaussian Process Latent Variable Models

1 code implementation • 14 Aug 2019 • Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian

Multimodal learning aims to discover the relationship between multiple modalities.

Cross-Modal Retrieval Retrieval

Paper
Code

Learning Personalized Attribute Preference via Multi-task AUC Optimization

no code implementations • 18 Jun 2019 • Zhiyong Yang, Qianqian Xu, Xiaochun Cao, Qingming Huang

Traditionally, most of the existing attribute learning methods are trained based on the consensus of annotations aggregated from a limited number of annotators.

Attribute

Paper
Add Code

Multimodal Transformer with Multi-View Visual Representation for Image Captioning

no code implementations • 20 May 2019 • Jun Yu, Jing Li, Zhou Yu, Qingming Huang

Despite the success of existing studies, current methods only model the co-attention that characterizes the inter-modal interactions while neglecting the self-attention that characterizes the intra-modal interactions.

Decoder Image Captioning +2

Paper
Add Code

Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization

1 code implementation • CVPR 2019 • Junbao Zhuo, Shuhui Wang, Shuhao Cui, Qingming Huang

We address the unsupervised open domain recognition (UODR) problem, where the categories in labeled source domain S are only a subset of those in unlabeled target domain T. The task is to correctly classify all samples in T, including known and unknown categories.

Classification General Classification

Paper
Code

Cascaded Partial Decoder for Fast and Accurate Salient Object Detection

1 code implementation • CVPR 2019 • Zhe Wu, Li Su, Qingming Huang

In this paper, we propose a novel Cascaded Partial Decoder (CPD) framework for fast and accurate salient object detection.

Ranked #1 on RGB Salient Object Detection on ISTD

Camouflaged Object Segmentation Decoder +3

Paper
Code

CenterNet: Keypoint Triplets for Object Detection

20 code implementations • ICCV 2019 • Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian

In object detection, keypoint-based approaches often suffer a large number of incorrect object bounding boxes, arguably due to the lack of an additional look into the cropped regions.

Ranked #116 on Object Detection on COCO test-dev

Object object-detection +1

Paper
Code

Spatiotemporal CNN for Video Object Segmentation

1 code implementation • CVPR 2019 • Kai Xu, Longyin Wen, Guorong Li, Liefeng Bo, Qingming Huang

Specifically, the temporal coherence branch pretrained in an adversarial fashion from unlabeled video data, is designed to capture the dynamic appearance and motion cues of video sequences to guide object segmentation.

Ranked #2 on Semi-Supervised Video Object Segmentation on YouTube

Object Segmentation +5

Paper
Code

Deep Robust Subjective Visual Property Prediction in Crowdsourcing

no code implementations • CVPR 2019 • Qianqian Xu, Zhiyong Yang, Yangbangyan Jiang, Xiaochun Cao, Qingming Huang, Yuan YAO

The problem of estimating subjective visual properties (SVP) of images (e.g., Shoes A is more comfortable than B) is gaining rising attention.

Property Prediction

Paper
Add Code

See Better Before Looking Closer: Weakly Supervised Data Augmentation Network for Fine-Grained Visual Classification

4 code implementations • 26 Jan 2019 • Tao Hu, Honggang Qi, Qingming Huang, Yan Lu

Specifically, for each training image, we first generate attention maps to represent the object's discriminative parts by weakly supervised learning.

Ranked #12 on Fine-Grained Image Classification on CUB-200-2011

Data Augmentation Fine-Grained Image Classification +3

Paper
Code

HSCS: Hierarchical Sparsity Based Co-saliency Detection for RGBD Images

no code implementations • 16 Nov 2018 • Runmin Cong, Jianjun Lei, Huazhu Fu, Qingming Huang, Xiaochun Cao, Nam Ling

In this paper, we propose a novel co-saliency detection method for RGBD images based on hierarchical sparsity reconstruction and energy function refinement.

Co-Salient Object Detection

Paper
Add Code

Person Re-Identification by Semantic Region Representation and Topology Constraint

no code implementations • 20 Aug 2018 • Jianjun Lei, Lijie Niu, Huazhu Fu, Bo Peng, Qingming Huang, Chunping Hou

In this paper, we propose a novel person re-identification method, which consists of a reliable representation called Semantic Region Representation (SRR), and an effective metric learning with Mapping Space Topology Constraint (MSTC).

Metric Learning Person Re-Identification

Paper
Add Code

Weakly Supervised Bilinear Attention Network for Fine-Grained Visual Classification

no code implementations • 6 Aug 2018 • Tao Hu, Jizheng Xu, Cong Huang, Honggang Qi, Qingming Huang, Yan Lu

Besides, we propose attention regularization and attention dropout to weakly supervise the generating process of attention maps.

Classification Fine-Grained Image Classification +1

Paper
Add Code

A Margin-based MLE for Crowdsourced Partial Ranking

no code implementations • 29 Jul 2018 • Qianqian Xu, Jiechao Xiong, Xinwei Sun, Zhiyong Yang, Xiaochun Cao, Qingming Huang, Yuan YAO

A preference order or ranking aggregated from pairwise comparison data is commonly understood as a strict total order.

Paper
Add Code

RAM: A Region-Aware Deep Model for Vehicle Re-Identification

no code implementations • 25 Jun 2018 • Xiaobin Liu, Shiliang Zhang, Qingming Huang, Wen Gao

Specifically, in addition to extracting global features, RAM also extracts features from a series of local regions.

Vehicle Re-Identification

Paper
Add Code

The Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking

no code implementations • ECCV 2018 • Dawei Du, Yuankai Qi, Hongyang Yu, Yifan Yang, Kaiwen Duan, Guorong Li, Weigang Zhang, Qingming Huang, Qi Tian

Selected from 10 hours raw videos, about 80,000 representative frames are fully annotated with bounding boxes as well as up to 14 kinds of attributes (e.g., weather condition, flying altitude, camera view, vehicle category, and occlusion) for three fundamental computer vision tasks: object detection, single object tracking, and multiple object tracking.

Ranked #5 on Object Detection on UAVDT

Multiple Object Tracking Object +3

Paper
Add Code

Facial Landmarks Detection by Self-Iterative Regression based Landmarks-Attention Network

no code implementations • 18 Mar 2018 • Tao Hu, Honggang Qi, Jizheng Xu, Qingming Huang

Only one self-iterative regressor is trained to learn the descent directions for samples from coarse stages to fine stages, and parameters are iteratively updated by the same regressor.

Ranked #16 on Face Alignment on 300W (NME_inter-pupil (%, Common) metric)

Face Alignment regression

Paper
Add Code

Review of Visual Saliency Detection with Comprehensive Information

no code implementations • 9 Mar 2018 • Runmin Cong, Jianjun Lei, Huazhu Fu, Ming-Ming Cheng, Weisi Lin, Qingming Huang

With the acquisition technology development, more comprehensive information, such as depth cue, inter-image correspondence, or temporal relationship, is available to extend image saliency detection to RGBD saliency detection, co-saliency detection, or video saliency detection.

Co-Salient Object Detection Video Saliency Detection

Paper
Add Code

From Social to Individuals: a Parsimonious Path of Multi-level Models for Crowdsourced Preference Aggregation

no code implementations • 8 Mar 2018 • Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Qingming Huang, Yuan YAO

In crowdsourced preference aggregation, it is often assumed that all the annotators are subject to a common preference or social utility function which generates their comparison behaviors in experiments.

Paper
Add Code

Less Is More: Picking Informative Frames for Video Captioning

no code implementations • ECCV 2018 • Yangyu Chen, Shuhui Wang, Weigang Zhang, Qingming Huang

We propose a plug-and-play PickNet to perform informative frame picking in video captioning.

Decoder Video Captioning

Paper
Add Code

Towards Realistic Face Photo-Sketch Synthesis via Composition-Aided GANs

2 code implementations • 4 Dec 2017 • Jun Yu, Xingxin Xu, Fei Gao, Shengjie Shi, Meng Wang, DaCheng Tao, Qingming Huang

Experimental results show that our method is capable of generating both visually comfortable and identity-preserving face sketches/photos over a wide range of challenging data.

Ranked #1 on Face Sketch Synthesis on CUFS (FID metric)

Face Sketch Synthesis Generative Adversarial Network

Paper
Code

From Common to Special: When Multi-Attribute Learning Meets Personalized Opinions

no code implementations • 18 Nov 2017 • Zhiyong Yang, Qianqian Xu, Xiaochun Cao, Qingming Huang

However, both categories ignore the joint effect of the two mentioned factors: the personal diversity with respect to the global consensus; and the intrinsic correlation among multiple attributes.

Attribute feature selection

Paper
Add Code

HodgeRank with Information Maximization for Crowdsourced Pairwise Ranking Aggregation

no code implementations • 16 Nov 2017 • Qianqian Xu, Jiechao Xiong, Xi Chen, Qingming Huang, Yuan YAO

Recently, crowdsourcing has emerged as an effective paradigm for human-powered large scale problem solving in various domains.

Paper
Add Code

An Iterative Co-Saliency Framework for RGBD Images

no code implementations • 4 Nov 2017 • Runmin Cong, Jianjun Lei, Huazhu Fu, Weisi Lin, Qingming Huang, Xiaochun Cao, Chunping Hou

In this paper, we propose an iterative RGBD co-saliency framework, which utilizes the existing single saliency maps as the initialization, and generates the final RGBD cosaliency map by using a refinement-cycle model.

Co-Salient Object Detection

Paper
Add Code

Saliency Detection for Stereoscopic Images Based on Depth Confidence Analysis and Multiple Cues Fusion

no code implementations • 14 Oct 2017 • Runmin Cong, Jianjun Lei, Changqing Zhang, Qingming Huang, Xiaochun Cao, Chunping Hou

Stereoscopic perception is an important part of human visual system that allows the brain to perceive depth.

graph construction Saliency Detection

Paper
Add Code

Co-saliency Detection for RGBD Images Based on Multi-constraint Feature Matching and Cross Label Propagation

no code implementations • 14 Oct 2017 • Runmin Cong, Jianjun Lei, Huazhu Fu, Qingming Huang, Xiaochun Cao, Chunping Hou

Different from the most existing co-saliency methods focusing on RGB images, this paper proposes a novel co-saliency detection model for RGBD images, which utilizes the depth information to enhance identification of co-saliency.

Co-Salient Object Detection

Paper
Add Code

Multimodal Gaussian Process Latent Variable Models With Harmonization

no code implementations • ICCV 2017 • Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian

We incorporate the harmonization mechanism into the learning process of multimodal GPLVMs.

Cross-Modal Retrieval Retrieval

Paper
Add Code

Exploring Outliers in Crowdsourced Ranking for QoE

no code implementations • 18 Jul 2017 • Qianqian Xu, Ming Yan, Chendi Huang, Jiechao Xiong, Qingming Huang, Yuan YAO

Outlier detection is a crucial part of robust evaluation for crowdsourceable assessment of Quality of Experience (QoE) and has attracted much attention in recent years.

Outlier Detection

Paper
Add Code

Online Asymmetric Similarity Learning for Cross-Modal Retrieval

no code implementations • CVPR 2017 • Yiling Wu, Shuhui Wang, Qingming Huang

In this paper, we propose an online learning method to learn the similarity function between heterogeneous modalities by preserving the relative similarity in the training data, which is modeled as a set of bi-directional hinge loss constraints on the cross-modal training triplets.

Cross-Modal Retrieval Retrieval +2

Paper
Add Code

A Graph Regularized Deep Neural Network for Unsupervised Image Representation Learning

1 code implementation • CVPR 2017 • Shijie Yang, Liang Li, Shuhui Wang, Weigang Zhang, Qingming Huang

Deep Auto-Encoder (DAE) has shown its promising power in high-level representation learning.

Representation Learning

Paper
Code

Hedged Deep Tracking

no code implementations • CVPR 2016 • Yuankai Qi, Shengping Zhang, Lei Qin, Hongxun Yao, Qingming Huang, Jongwoo Lim, Ming-Hsuan Yang

In recent years, several methods have been developed to utilize hierarchical features learned from a deep convolutional neural network (CNN) for visual tracking.

Visual Tracking

Paper
Add Code

Geometric Hypergraph Learning for Visual Tracking

no code implementations • 18 Mar 2016 • Dawei Du, Honggang Qi, Longyin Wen, Qi Tian, Qingming Huang, Siwei Lyu

Graph based representation is widely used in visual tracking field by finding correct correspondences between target parts in consecutive frames.

Visual Tracking

Paper
Add Code

Relay Backpropagation for Effective Learning of Deep Convolutional Neural Networks

1 code implementation • 18 Dec 2015 • Li Shen, Zhouchen Lin, Qingming Huang

Learning deeper convolutional neural networks becomes a tendency in recent years.

Ranked #8 on Long-tail Learning on VOC-MLT

General Classification Long-tail Learning +1

Paper
Code

Similarity Gaussian Process Latent Variable Model for Multi-Modal Data Analysis

no code implementations • ICCV 2015 • Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian

Data from real applications involve multiple modalities representing content with the same semantics and deliver rich information from complementary aspects.

Retrieval

Paper
Add Code

Evaluating Visual Properties via Robust HodgeRank

no code implementations • 15 Aug 2014 • Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Qingming Huang, Yuan YAO

In this paper we study the problem of how to estimate such visual properties from a ranking perspective with the help of the annotators from online crowdsourcing platforms.

Graph Sampling Outlier Detection

Paper
Add Code

Multi-level Discriminative Dictionary Learning towards Hierarchical Visual Categorization

no code implementations • CVPR 2013 • Li Shen, Shuhui Wang, Gang Sun, Shuqiang Jiang, Qingming Huang

For each internode of the hierarchical category structure, a discriminative dictionary and a set of classification models are learnt for visual categorization, and the dictionaries in different layers are learnt to exploit the discriminative visual properties of different granularity.

Dictionary Learning

Paper
Add Code
