Search Results for author: Dong Xu

Found 102 papers, 35 papers with code

Unsupervised Multi-scale Expressive Speaking Style Modeling with Hierarchical Context Information for Audiobook Speech Synthesis

no code implementations • COLING 2022 • Xueyuan Chen, Shun Lei, Zhiyong Wu, Dong Xu, Weifeng Zhao, Helen Meng

On top of these, a bi-reference attention mechanism is used to align both local-scale reference style embedding sequence and local-scale context style embedding sequence with corresponding phoneme embedding sequence.

Speech Synthesis

Group-aware Parameter-efficient Updating for Content-Adaptive Neural Video Compression

no code implementations • 7 May 2024 • Zhenghao Chen, Luping Zhou, Zhihao Hu, Dong Xu

Content-adaptive compression is crucial for enhancing the adaptability of the pre-trained neural codec for various contents.

Image Compression Video Compression

RaFE: Generative Radiance Fields Restoration

no code implementations • 4 Apr 2024 • Zhongkai Wu, Ziyu Wan, Jing Zhang, Jing Liao, Dong Xu

Instead of reconstructing a blurred NeRF by averaging inconsistencies, we introduce a novel approach using Generative Adversarial Networks (GANs) for NeRF generation to better accommodate the geometric and appearance inconsistencies present in the multi-view images.

3D Reconstruction Novel View Synthesis

Bioinformatics and Biomedical Informatics with ChatGPT: Year One Review

no code implementations • 22 Mar 2024 • Jinge Wang, Zien Cheng, Qiuming Yao, Li Liu, Dong Xu, Gangqing Hu

The year 2023 marked a significant surge in the exploration of applying large language model (LLM) chatbots, notably ChatGPT, across various disciplines.

Chatbot Drug Discovery +2

AI-Generated Content Enhanced Computer-Aided Diagnosis Model for Thyroid Nodules: A ChatGPT-Style Assistant

no code implementations • 4 Feb 2024 • Jincao Yao, Yunpeng Wang, Zhikai Lei, Kai Wang, Xiaoxian Li, Jianhua Zhou, Xiang Hao, Jiafei Shen, Zhenping Wang, Rongrong Ru, Yaqing Chen, Yahan Zhou, Chen Chen, YanMing Zhang, Ping Liang, Dong Xu

After training, ThyGPT could automatically evaluate thyroid nodules and engage in effective communication with physicians through human-computer interaction.

Specificity

Data-Free Generalized Zero-Shot Learning

no code implementations • 28 Jan 2024 • Bowen Tang, Long Yan, Jing Zhang, Qian Yu, Lu Sheng, Dong Xu

Firstly, to recover the virtual features of the base data, we model the CLIP features of base class images as samples from a von Mises-Fisher (vMF) distribution based on the pre-trained classifier.

Generalized Zero-Shot Learning Zero-shot Generalization

Harnessing Neuron Stability to Improve DNN Verification

2 code implementations • 19 Jan 2024 • Hai Duong, Dong Xu, ThanhVu Nguyen, Matthew B. Dwyer

We evaluate the effectiveness of VeriStable across a range of challenging benchmarks including fully-connected feedforward networks (FNNs), convolutional neural networks (CNNs) and residual networks (ResNets) applied to the standard MNIST and CIFAR datasets.

Multi-modality Affinity Inference for Weakly Supervised 3D Semantic Segmentation

1 code implementation • 27 Dec 2023 • Xiawei Li, Qingyuan Xu, Jing Zhang, Tianyi Zhang, Qian Yu, Lu Sheng, Dong Xu

The point affinity proposed in this paper is characterized by features from multiple modalities (e.g., point cloud and RGB), and is further refined by normalizing the classifier weights to alleviate the detrimental effects of the long-tailed distribution without requiring a prior on the category distribution.

3D Semantic Segmentation Point Cloud Segmentation +1

SVGDreamer: Text Guided SVG Generation with Diffusion Model

1 code implementation • 27 Dec 2023 • XiMing Xing, Haitao Zhou, Chuang Wang, Jing Zhang, Dong Xu, Qian Yu

However, existing text-to-SVG generation methods lack editability and struggle with visual quality and result diversity.

Vector Graphics

A Video is Worth 256 Bases: Spatial-Temporal Expectation-Maximization Inversion for Zero-Shot Video Editing

no code implementations • 10 Dec 2023 • Maomao Li, Yu Li, Tianyu Yang, Yunfei Liu, Dongxu Yue, Zhihui Lin, Dong Xu

This paper presents a video inversion approach for zero-shot video editing, which aims to model the input video with low-rank representation during the inversion process.

Video Editing

UFDA: Universal Federated Domain Adaptation with Practical Assumptions

no code implementations • 27 Nov 2023 • Xinhui Liu, Zhenghao Chen, Luping Zhou, Dong Xu, Wei Xi, Gairui Bai, Yihan Zhao, Jizhong Zhao

Conventional Federated Domain Adaptation (FDA) approaches usually demand an abundance of assumptions, which makes them significantly less feasible for real-world situations and introduces security hazards.

Domain Adaptation

Progressive Target-Styled Feature Augmentation for Unsupervised Domain Adaptation on Point Clouds

1 code implementation • 27 Nov 2023 • Zicheng Wang, Zhen Zhao, Yiming Wu, Luping Zhou, Dong Xu

Unlike previous works that focus on feature extractor adaptation, our PTSFA approach focuses on classifier adaptation.

Self-Supervised Learning Unsupervised Domain Adaptation

Adapting Segment Anything Model (SAM) through Prompt-based Learning for Enhanced Protein Identification in Cryo-EM Micrographs

1 code implementation • 4 Nov 2023 • Fei He, Zhiyuan Yang, Mingyue Gao, Biplab Poudel, Newgin Sam Ebin Sam Dhas, Rajan Gyawali, Ashwin Dhakal, Jianlin Cheng, Dong Xu

Cryo-electron microscopy (cryo-EM) remains pivotal in structural biology, yet the task of protein particle picking, integral for 3D protein structure construction, is laden with manual inefficiencies.

Image Segmentation object-detection +2

E4S: Fine-grained Face Swapping via Editing With Regional GAN Inversion

2 code implementations • 23 Oct 2023 • Maomao Li, Ge Yuan, Cairong Wang, Zhian Liu, Yong Zhang, Yongwei Nie, Jue Wang, Dong Xu

Based on this disentanglement, face swapping can be simplified as style and mask swapping.

Disentanglement Face Swapping +3

Diffusion Model is Secretly a Training-free Open Vocabulary Semantic Segmenter

1 code implementation • 6 Sep 2023 • Jinglong Wang, Xiawei Li, Jing Zhang, Qingyuan Xu, Qin Zhou, Qian Yu, Lu Sheng, Dong Xu

Pre-trained text-image discriminative models, such as CLIP, have been explored for open-vocabulary semantic segmentation, with unsatisfactory results due to the loss of crucial localization information and awareness of object shapes.

Contrastive Learning Denoising +5

Inversion-by-Inversion: Exemplar-based Sketch-to-Photo Synthesis via Stochastic Differential Equations without Training

1 code implementation • 15 Aug 2023 • XiMing Xing, Chuang Wang, Haitao Zhou, Zhihao Hu, Chongxuan Li, Dong Xu, Qian Yu

In the full-control inversion process, we propose an appearance-energy function to control the color and texture of the final generated photo. Importantly, our Inversion-by-Inversion pipeline is training-free and can accept different types of exemplars for color and texture control.

Image Generation

Distortion-aware Transformer in 360° Salient Object Detection

1 code implementation • 7 Aug 2023 • Yinjie Zhao, Lichen Zhao, Qian Yu, Jing Zhang, Lu Sheng, Dong Xu

The first is a Distortion Mapping Module, which guides the model to pre-adapt to distorted features globally.

ERP Object +3

Reinforcement Learning-based Non-Autoregressive Solver for Traveling Salesman Problems

1 code implementation • 1 Aug 2023 • Yubin Xiao, Di Wang, Boyang Li, Huanhuan Chen, Wei Pang, Xuan Wu, Hao Li, Dong Xu, Yanchun Liang, You Zhou

The Traveling Salesman Problem (TSP) is a well-known combinatorial optimization problem with broad real-world applications.

Combinatorial Optimization reinforcement-learning +2

VideoControlNet: A Motion-Guided Video-to-Video Translation Framework by Using Diffusion Model with ControlNet

no code implementations • 26 Jul 2023 • Zhihao Hu, Dong Xu

In this work, using the diffusion model with ControlNet, we propose a new motion-guided video-to-video translation framework called VideoControlNet to generate various videos based on the given prompts and the condition from the input video.

Image Generation

DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models

1 code implementation • NeurIPS 2023 • XiMing Xing, Chuang Wang, Haitao Zhou, Jing Zhang, Qian Yu, Dong Xu

Even though trained mainly on images, we discover that pretrained diffusion models show impressive power in guiding sketch synthesis.

Bid Optimization for Offsite Display Ad Campaigns on eCommerce

no code implementations • 18 Jun 2023 • Hangjian Li, Dong Xu, Konstantin Shmakov, Kuang-Chih Lee, Wei Shen

Online retailers often use third-party demand-side-platforms (DSPs) to conduct offsite advertising and reach shoppers across the Internet on behalf of their advertisers.

Association of stroke lesion distributions with atrial fibrillation detected after stroke

no code implementations • 24 May 2023 • Yiming Chen, Sihui Wang, Dong Xu

The knowledge of the unique characteristics of the population with atrial fibrillation detected after stroke (AFDAS) enables more ischemic stroke patients to benefit from more aggressive anticoagulation therapy and AF management.

Q2ATransformer: Improving Medical VQA via an Answer Querying Decoder

no code implementations • 4 Apr 2023 • Yunyi Liu, Zhanyu Wang, Dong Xu, Luping Zhou

To bridge this gap, we propose a new Transformer-based framework for medical VQA (named Q2ATransformer), which integrates the advantages of both the classification and the generation approaches and provides a unified treatment for closed-ended and open-ended questions.

Classification Decoder +3

VL-SAT: Visual-Linguistic Semantics Assisted Training for 3D Semantic Scene Graph Prediction in Point Cloud

1 code implementation • CVPR 2023 • Ziqin Wang, Bowen Cheng, Lichen Zhao, Dong Xu, Yang Tang, Lu Sheng

Since 2D images provide rich semantics and scene graphs are in nature coped with languages, in this study, we propose Visual-Linguistic Semantics Assisted Training (VL-SAT) scheme that can significantly empower 3DSSG prediction models with discrimination about long-tailed and ambiguous semantic relations.

Ranked #1 on 3d scene graph generation on 3DSSG (using extra training data)

3d scene graph generation Relation

Large AI Models in Health Informatics: Applications, Challenges, and the Future

1 code implementation • 21 Mar 2023 • Jianing Qiu, Lin Li, Jiankai Sun, Jiachuan Peng, Peilun Shi, Ruiyang Zhang, Yinzhao Dong, Kyle Lam, Frank P.-W. Lo, Bo Xiao, Wu Yuan, Ningli Wang, Dong Xu, Benny Lo

Large AI models, or foundation models, are models recently emerging with massive scales both parameter-wise and data-wise, the magnitudes of which can reach beyond billions.

Decision Making Drug Discovery +1

Conflict-Based Cross-View Consistency for Semi-Supervised Semantic Segmentation

1 code implementation • CVPR 2023 • Zicheng Wang, Zhen Zhao, Xiaoxia Xing, Dong Xu, Xiangyu Kong, Luping Zhou

In this work, we propose a new conflict-based cross-view consistency (CCVC) method based on a two-branch co-training framework which aims at enforcing the two sub-nets to learn informative features from irrelevant views.

Semi-Supervised Semantic Segmentation

Bridging Synthetic and Real Images: a Transferable and Multiple Consistency aided Fundus Image Enhancement Framework

no code implementations • 23 Feb 2023 • Erjian Guo, Huazhu Fu, Luping Zhou, Dong Xu

Moreover, we propose a novel multi-stage multi-attention guided enhancement network (MAGE-Net) as the backbone of our teacher and student networks.

Domain Adaptation Image Enhancement

Diffusion Models in Bioinformatics: A New Wave of Deep Learning Revolution in Action

no code implementations • 13 Feb 2023 • Zhiye Guo, Jian Liu, Yanli Wang, Mengrui Chen, Duolin Wang, Dong Xu, Jianlin Cheng

This review aims to provide a rather thorough overview of the applications of diffusion models in bioinformatics to aid their further development in bioinformatics and computational biology.

Denoising Protein Design

Complexity-Guided Slimmable Decoder for Efficient Deep Video Compression

no code implementations • CVPR 2023 • Zhihao Hu, Dong Xu

In this work, we propose the complexity-guided slimmable decoder (cgSlimDecoder) in combination with skip-adaptive entropy coding (SaEC) for efficient deep video compression.

Decoder Motion Compensation +1

Content Adaptive Latents and Decoder for Neural Image Compression

no code implementations • 20 Dec 2022 • Guanbo Pan, Guo Lu, Zhihao Hu, Dong Xu

Although several content adaptive methods have been proposed by updating the encoder-side components, the adaptability of both latents and the decoder is not well exploited.

Decoder Image Compression

Slow Motion Matters: A Slow Motion Enhanced Network for Weakly Supervised Temporal Action Localization

no code implementations • 21 Nov 2022 • Weiqi Sun, Rui Su, Qian Yu, Dong Xu

Weakly supervised temporal action localization (WTAL) aims to localize actions in untrimmed videos with only weak supervision information (e.g., video-level labels).

Weakly Supervised Temporal Action Localization

Towards Explainable 3D Grounded Visual Question Answering: A New Benchmark and Strong Baseline

1 code implementation • 24 Sep 2022 • Lichen Zhao, Daigang Cai, Jing Zhang, Lu Sheng, Dong Xu, Rui Zheng, Yinjie Zhao, Lipeng Wang, Xibo Fan

We also propose a new 3D VQA framework to effectively predict the completely visually grounded and explainable answer.

Question Answering Visual Question Answering

Improving RGB-D Point Cloud Registration by Learning Multi-scale Local Linear Transformation

1 code implementation • 31 Aug 2022 • ZiMing Wang, Xiaoliang Huo, Zhenghao Chen, Jing Zhang, Lu Sheng, Dong Xu

In addition to previous methods that seek correspondences by hand-crafted or learnt geometric features, recent point cloud registration methods have tried to apply RGB-D data to achieve more accurate correspondence.

Point Cloud Registration

SketchSampler: Sketch-based 3D Reconstruction via View-dependent Depth Sampling

1 code implementation • 14 Aug 2022 • Chenjian Gao, Qian Yu, Lu Sheng, Yi-Zhe Song, Dong Xu

Reconstructing a 3D shape based on a single sketch image is challenging due to the large domain gap between a sparse, irregular sketch and a regular, dense 3D shape.

3D Reconstruction

Coarse-to-fine Deep Video Coding with Hyperprior-guided Mode Prediction

no code implementations • CVPR 2022 • Zhihao Hu, Guo Lu, Jinyang Guo, Shan Liu, Wei Jiang, Dong Xu

Previous deep video compression approaches use only a single-scale motion compensation strategy and rarely adopt the mode prediction technique from traditional standards such as H.264/H.265 for both motion and residual compression.

Motion Compensation Motion Estimation +1

Revisiting Deep Semi-supervised Learning: An Empirical Distribution Alignment Framework and Its Generalization Bound

no code implementations • 13 Mar 2022 • Feiyu Wang, Qin Wang, Wen Li, Dong Xu, Luc van Gool

Building on this new perspective, we first propose a new deep semi-supervised learning framework called Semi-supervised Learning by Empirical Distribution Alignment (SLEDA), in which existing techniques from the domain adaptation community can be readily used to address the semi-supervised learning problem by reducing the empirical distribution distance between labeled and unlabeled data.

Data Augmentation Domain Adaptation

LSVC: A Learning-Based Stereo Video Compression Framework

no code implementations • CVPR 2022 • Zhenghao Chen, Guo Lu, Zhihao Hu, Shan Liu, Wei Jiang, Dong Xu

In this work, we propose the first end-to-end optimized framework for compressing automotive stereo videos (i.e., stereo videos from autonomous driving applications) from both left and right views.

Autonomous Driving Motion Compensation +1

3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds

no code implementations • CVPR 2022 • Daigang Cai, Lichen Zhao, Jing Zhang, Lu Sheng, Dong Xu

Observing that the 3D captioning task and the 3D grounding task contain both shared and complementary information in nature, in this work, we propose a unified framework to jointly solve these two distinct but closely related tasks in a synergistic fashion, which consists of both shared task-agnostic modules and lightweight task-specific modules.

Attribute Dense Captioning +1

Learning Based Multi-Modality Image and Video Compression

no code implementations • CVPR 2022 • Guo Lu, Tianxiong Zhong, Jing Geng, Qiang Hu, Dong Xu

Specifically, given the image in the reference modality (e.g., the infrared image), we use the channel-wise alignment module to produce the aligned features based on the affine transform.

Data Compression Video Compression

EndHiC: assemble large contigs into chromosomal-level scaffolds using the Hi-C links from contig ends

1 code implementation • 30 Nov 2021 • Sen Wang, Hengchao Wang, Fan Jiang, Anqi Wang, Hangwei Liu, Hanbo Zhao, Boyuan Yang, Dong Xu, Yan Zhang, Wei Fan

As the Hi-C links of two adjacent contigs concentrate only at the neighbor ends of the contigs, larger contig size will reduce the power to differentiate adjacent (signal) and non-adjacent (noise) contig linkages, leading to a higher rate of mis-assembly.

SRDAN: Scale-Aware and Range-Aware Domain Adaptation Network for Cross-Dataset 3D Object Detection

1 code implementation • CVPR 2021 • Weichen Zhang, Wen Li, Dong Xu

In this work, we propose a new cross-dataset 3D object detection method named Scale-aware and Range-aware Domain Adaptation Network (SRDAN).

3D Object Detection Domain Adaptation +2

NG+: A Multi-Step Matrix-Product Natural Gradient Method for Deep Learning

1 code implementation • 14 Jun 2021 • MingHan Yang, Dong Xu, Qiwen Cui, Zaiwen Wen, Pengxiang Xu

In this paper, a novel second-order method called NG+ is proposed.

Image Classification Machine Translation +1

CBANet: Towards Complexity and Bitrate Adaptive Deep Image Compression using a Single Network

no code implementations • 26 May 2021 • Jinyang Guo, Dong Xu, Guo Lu

Furthermore, to achieve variable bitrate decoding with one single decoder, we propose a bitrate adaptive module to project the representation from a base bitrate to the expected representation at a target bitrate for transmission.

Decoder Image Compression

FVC: A New Framework towards Deep Video Compression in Feature Space

no code implementations • CVPR 2021 • Zhihao Hu, Guo Lu, Dong Xu

In this work, we propose a feature-space video coding network (FVC) by performing all major operations (i.e., motion estimation, motion compression, motion compensation and residual compression) in the feature space.

Motion Compensation Motion Estimation +1

VoxelContext-Net: An Octree based Framework for Point Cloud Compression

no code implementations • CVPR 2021 • Zizheng Que, Guo Lu, Dong Xu

In this paper, we propose a two-stage deep learning framework called VoxelContext-Net for both static and dynamic point cloud compression.

Decoder

Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds

1 code implementation • CVPR 2021 • Bowen Cheng, Lu Sheng, Shaoshuai Shi, Ming Yang, Dong Xu

Inspired by the back-tracing strategy in the conventional Hough voting methods, in this work, we introduce a new 3D object detection method, named as Back-tracing Representative Points Network (BRNet), which generatively back-traces the representative points from the vote centers and also revisits complementary seed points around these generated points, so as to better capture the fine local structural features surrounding the potential objects from the raw point clouds.

Ranked #17 on 3D Object Detection on ScanNetV2

3D Object Detection Object +1

VDM-DA: Virtual Domain Modeling for Source Data-free Domain Adaptation

no code implementations • 26 Mar 2021 • Jiayi Tian, Jing Zhang, Wen Li, Dong Xu

On the other hand, we also design an effective distribution alignment method to reduce the distribution divergence between the virtual domain and the target domain by gradually improving the compactness of the target domain distribution through model learning.

Object Recognition Unsupervised Domain Adaptation

Salient Object Detection via Integrity Learning

3 code implementations • 19 Jan 2021 • Mingchen Zhuge, Deng-Ping Fan, Nian Liu, Dingwen Zhang, Dong Xu, Ling Shao

We define the concept of integrity at both a micro and macro level.

Object object-detection +2

Formal Language Constrained Markov Decision Processes

no code implementations • 1 Jan 2021 • Eleanor Quint, Dong Xu, Samuel W Flint, Stephen D Scott, Matthew Dwyer

In order to satisfy safety conditions, an agent may be constrained from acting freely.

3DVG-Transformer: Relation Modeling for Visual Grounding on Point Clouds

no code implementations • ICCV 2021 • Lichen Zhao, Daigang Cai, Lu Sheng, Dong Xu

Visual grounding on 3D point clouds is an emerging vision and language task that benefits various applications in understanding the 3D visual world.

Object Object Proposal Generation +2

StyleFormer: Real-Time Arbitrary Style Transfer via Parametric Style Composition

1 code implementation • ICCV 2021 • Xiaolei Wu, Zhihao Hu, Lu Sheng, Dong Xu

In this work, we propose a new feed-forward arbitrary style transfer method, referred to as StyleFormer, which can simultaneously fulfill fine-grained style diversity and semantic content coherency.

Style Transfer

STVGBert: A Visual-Linguistic Transformer Based Framework for Spatio-Temporal Video Grounding

no code implementations • ICCV 2021 • Rui Su, Qian Yu, Dong Xu

Spatio-temporal video grounding (STVG) aims to localize a spatio-temporal tube of a target object in an untrimmed video based on a query sentence.

Object Sentence +2

Inception Convolution with Efficient Dilation Search

1 code implementation • CVPR 2021 • Jie Liu, Chuming Li, Feng Liang, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang, Dong Xu

To develop a practical method for learning complex inception convolution based on the data, a simple but effective search algorithm, referred to as efficient dilation optimization (EDO), is developed.

Human Detection Instance Segmentation +4

Human-centric Spatio-Temporal Video Grounding With Visual Transformers

1 code implementation • 10 Nov 2020 • Zongheng Tang, Yue Liao, Si Liu, Guanbin Li, Xiaojie Jin, Hongxu Jiang, Qian Yu, Dong Xu

HC-STVG is a video grounding task that requires both spatial (where) and temporal (when) localization.

Referring Expression Sentence +3

Deep Learning Analysis and Age Prediction from Shoeprints

1 code implementation • 7 Nov 2020 • Muhammad Hassan, Yan Wang, Di Wang, Daixi Li, Yanchun Liang, You Zhou, Dong Xu

We collected 100,000 shoeprints of subjects ranging from 7 to 80 years old and used the data to develop a deep learning end-to-end model ShoeNet to analyze age-related patterns and predict age.

Gender Classification

A Method of Generating Measurable Panoramic Image for Indoor Mobile Measurement System

no code implementations • 27 Oct 2020 • Hao Ma, Jingbin Liu, Zhirong Hu, Hongyu Qiu, Dong Xu, Zemin Wang, Xiaodong Gong, Sheng Yang

This paper designs a technical pipeline to generate high-quality panoramic images with depth information, which involves two critical research topics: the fusion of LiDAR and image data, and image stitching.

Image Stitching

A Simple and Efficient Registration of 3D Point Cloud and Image Data for Indoor Mobile Mapping System

no code implementations • 27 Oct 2020 • Hao Ma, Jingbin Liu, Keke Liu, Hongyu Qiu, Dong Xu, Zemin Wang, Xiaodong Gong, Sheng Yang

Registration of 3D LiDAR point clouds with optical images is critical in the combination of multi-source data.

Edge Detection

Improving Deep Video Compression by Resolution-adaptive Flow Coding

no code implementations • ECCV 2020 • Zhihao Hu, Zhenghao Chen, Dong Xu, Guo Lu, Wanli Ouyang, Shuhang Gu

In this work, we propose a new framework called Resolution-adaptive Flow Coding (RaFC) to effectively compress the flow maps globally and locally, in which we use multi-resolution representations instead of single-resolution representations for both the input flow maps and the output motion features of the MV encoder.

Optical Flow Estimation Video Compression

Systematic Generation of Diverse Benchmarks for DNN Verification

1 code implementation • 14 Jul 2020 • Dong Xu, David Shriver, Matthew B. Dwyer, Sebastian Elbaum

The field of verification has advanced due to the interplay of theoretical development and empirical evaluation.

Simulating multi-exit evacuation using deep reinforcement learning

no code implementations • 11 Jul 2020 • Dong Xu, Xiao Huang, Joseph Mango, Xiang Li, Zhenlong Li

We propose a multi-exit evacuation simulation based on Deep Reinforcement Learning (DRL), referred to as MultiExit-DRL, which involves a Deep Neural Network (DNN) framework to facilitate state-to-action mapping.

reinforcement-learning Reinforcement Learning (RL)

Enhance Curvature Information by Structured Stochastic Quasi-Newton Methods

no code implementations • CVPR 2021 • Ming-Han Yang, Dong Xu, Hongyu Chen, Zaiwen Wen, Mengyun Chen

In this paper, we consider stochastic second-order methods for minimizing a finite summation of nonconvex functions.

Second-order methods

Sketchy Empirical Natural Gradient Methods for Deep Learning

1 code implementation • 10 Jun 2020 • Ming-Han Yang, Dong Xu, Zaiwen Wen, Mengyun Chen, Pengxiang Xu

Experiments on the distributed large-batch training show that the scaling efficiency is quite reasonable.

Content Adaptive and Error Propagation Aware Deep Video Compression

no code implementations • ECCV 2020 • Guo Lu, Chunlei Cai, Xiaoyun Zhang, Li Chen, Wanli Ouyang, Dong Xu, Zhiyong Gao

Therefore, the encoder is adaptive to different video contents and achieves better compression performance by reducing the domain gap between the training and testing datasets.

Decoder Video Compression

Channel Pruning Guided by Classification Loss and Feature Importance

no code implementations • 15 Mar 2020 • Jinyang Guo, Wanli Ouyang, Dong Xu

To this end, we propose a new strategy to suppress the influence of unimportant features (i.e., the features that will be removed at the next pruning stage).

Classification Feature Importance +1

A Unified End-to-End Framework for Efficient Deep Image Compression

1 code implementation • 9 Feb 2020 • Jiaheng Liu, Guo Lu, Zhihao Hu, Dong Xu

Our EDIC method can also be readily incorporated with the Deep Video Compression (DVC) framework to further improve the video compression performance.

Decoder Image Compression +1

Translating multispectral imagery to nighttime imagery via conditional generative adversarial networks

no code implementations • 28 Dec 2019 • Xiao Huang, Dong Xu, Zhenlong Li, Cuizhen Wang

The results of this study prove the possibility of multispectral-to-nighttime translation and further indicate that, with the additional social media data, the generated nighttime imagery can be very similar to the ground-truth imagery.

Translation

Formal Language Constraints for Markov Decision Processes

1 code implementation • 2 Oct 2019 • Eleanor Quint, Dong Xu, Samuel Flint, Stephen Scott, Matthew Dwyer

In order to satisfy safety conditions, an agent may be constrained from acting freely.

Atari Games

IntersectGAN: Learning Domain Intersection for Generating Images with Multiple Attributes

no code implementations • 21 Sep 2019 • Zehui Yao, Boyan Zhang, Zhiyong Wang, Wanli Ouyang, Dong Xu, Dagan Feng

For example, given two image domains $X_1$ and $X_2$ with certain attributes, the intersection $X_1 \cap X_2$ denotes a new domain where images possess the attributes from both $X_1$ and $X_2$ domains.

Attribute

Refactoring Neural Networks for Verification

2 code implementations • 6 Aug 2019 • David Shriver, Dong Xu, Sebastian Elbaum, Matthew B. Dwyer

Deep neural networks (DNN) are growing in capability and applicability.

Deep Learning Detection of Inaccurate Smart Electricity Meters: A Case Study

1 code implementation • 26 Jul 2019 • Ming Liu, Dongpeng Liu, Guangyu Sun, Yi Zhao, Duolin Wang, Fangxing Liu, Xiang Fang, Qing He, Dong Xu

Detecting inaccurate smart meters and targeting them for replacement can save significant resources.

Time Series Analysis

Improving Action Localization by Progressive Cross-stream Cooperation

no code implementations • CVPR 2019 • Rui Su, Wanli Ouyang, Luping Zhou, Dong Xu

Specifically, we first generate a larger set of region proposals by combining the latest region proposals from both streams, from which we can readily obtain a larger set of labelled training samples to help learn better action detection models.

Action Classification Action Detection +2

Gated Group Self-Attention for Answer Selection

no code implementations • 26 May 2019 • Dong Xu, Jianhui Ji, Haikuan Huang, Hongbo Deng, Wu-Jun Li

Nevertheless, it is difficult for RNN based models to capture the information about long-range dependency among words in the sentences of questions and answers.

Answer Selection Machine Translation +1

Hashing based Answer Selection

no code implementations • 26 May 2019 • Dong Xu, Wu-Jun Li

HAS adopts a hashing strategy to learn a binary matrix representation for each answer, which can dramatically reduce the memory cost for storing the matrix representations of answers.

Answer Selection

DVC: An End-to-end Deep Video Compression Framework

4 code implementations • CVPR 2019 • Guo Lu, Wanli Ouyang, Dong Xu, Xiaoyun Zhang, Chunlei Cai, Zhiyong Gao

Conventional video compression approaches use the predictive coding architecture and encode the corresponding motion information and residual information.

MS-SSIM Optical Flow Estimation +2

Constraining Action Sequences with Formal Languages for Deep Reinforcement Learning

no code implementations • 27 Sep 2018 • Dong Xu, Eleanor Quint, Zeynep Hakguder, Haluk Dogan, Stephen Scott, Matthew Dwyer

We study the problem of deep reinforcement learning where the agent's action sequences are constrained, e.g., prohibition of dithering or overactuating action sequences that might damage a robot, drone, or other physical device.

Atari Games reinforcement-learning +1

Dividing and Aggregating Network for Multi-view Action Recognition

no code implementations • ECCV 2018 • Dongang Wang, Wanli Ouyang, Wen Li, Dong Xu

We then train view-specific action classifiers based on the view-specific representation for each view and a view classifier based on the shared representation at lower layers.

Action Recognition Temporal Action Localization

Deep Kalman Filtering Network for Video Compression Artifact Reduction

1 code implementation • ECCV 2018 • Guo Lu, Wanli Ouyang, Dong Xu, Xiaoyun Zhang, Zhiyong Gao, Ming-Ting Sun

In this paper, we model the video artifact reduction task as a Kalman filtering procedure and restore decoded frames through a deep Kalman filtering network.

Video Compression

Paper
Code

Collaborative and Adversarial Network for Unsupervised Domain Adaptation

1 code implementation • CVPR 2018 • Weichen Zhang, Wanli Ouyang, Wen Li, Dong Xu

In this paper, we propose a new unsupervised domain adaptation approach called Collaborative and Adversarial Network (CAN) through domain-collaborative and domain-adversarial training of neural networks.

Unsupervised Domain Adaptation

Paper
Code

Complex Event Detection by Identifying Reliable Shots From Untrimmed Videos

no code implementations • ICCV 2017 • Hehe Fan, Xiaojun Chang, De Cheng, Yi Yang, Dong Xu, Alexander G. Hauptmann

relevant) to the given event class, we formulate this task as a multi-instance learning (MIL) problem by taking each video as a bag and the video shots in each video as instances.

Event Detection

Paper
Add Code

MUFold-SS: Protein Secondary Structure Prediction Using Deep Inception-Inside-Inception Networks

no code implementations • 12 Sep 2017 • Chao Fang, Yi Shang, Dong Xu

Results: Here, a very deep neural network, the deep inception-inside-inception networks (Deep3I), is proposed for protein secondary structure prediction and a software tool was implemented using this network.

Image Classification Protein Secondary Structure Prediction

Paper
Add Code

Image Projective Invariants

no code implementations • 19 Jul 2017 • Erbo Li, Hanlin Mo, Dong Xu, Hua Li

In this paper, we propose relative projective differential invariants (RPDIs) which are invariant to general projective transformations.

Image Retrieval Retrieval

Paper
Add Code

SPFTN: A Self-Paced Fine-Tuning Network for Segmenting Objects in Weakly Labelled Videos

no code implementations • CVPR 2017 • Dingwen Zhang, Le Yang, Deyu Meng, Dong Xu, Junwei Han

Object segmentation in weakly labelled videos is an interesting yet challenging task, which aims at learning to perform category-specific video object segmentation by only using video-level tags.

Object Semantic Segmentation +3

Paper
Add Code

Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates

no code implementations • 26 Jun 2017 • Jun Liu, Amir Shahroudy, Dong Xu, Alex C. Kot, Gang Wang

Skeleton-based human action recognition has attracted a lot of research attention during the past few years.

Ranked #6 on One-Shot 3D Action Recognition on NTU RGB+D 120

Action Recognition One-Shot 3D Action Recognition +2

Paper
Add Code

Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective

no code implementations • 11 May 2017 • Jing Zhang, Wanqing Li, Philip Ogunbona, Dong Xu

This paper takes a problem-oriented perspective and presents a comprehensive review of transfer learning methods, both shallow and deep, for cross-dataset visual recognition.

Transfer Learning

Paper
Add Code

Shape DNA: Basic Generating Functions for Geometric Moment Invariants

no code implementations • 7 Mar 2017 • Erbo Li, Yazhou Huang, Dong Xu, Hua Li

Two fundamental building blocks or generating functions (GFs) for invariants are discovered, which are dot product and vector product of point vectors in Euclidean space.

Information Retrieval Retrieval

Paper
Add Code

Learning Multi-level Deep Representations for Image Emotion Classification

no code implementations • 22 Nov 2016 • Tianrong Rao, Min Xu, Dong Xu

The proposed MldrNet combines deep representations of different levels, i.e., image semantics, image aesthetics, and low-level visual features, to effectively classify the emotion types of different kinds of images, such as abstract paintings and web images.

Classification Emotion Classification +1

Paper
Add Code

A Siamese Long Short-Term Memory Architecture for Human Re-Identification

no code implementations • European Conference on Computer Vision 2016 • Rahul Rama Varior, Bing Shuai, Jiwen Lu, Dong Xu, Gang Wang

Matching pedestrians across multiple camera views, known as human re-identification, is a challenging problem in visual surveillance.

Person Re-Identification

Paper
Add Code

Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition

no code implementations • 24 Jul 2016 • Jun Liu, Amir Shahroudy, Dong Xu, Gang Wang

To handle the noise and occlusion in 3D skeleton data, we introduce a new gating mechanism within LSTM to learn the reliability of the sequential input data and accordingly adjust its effect on updating the long-term context information stored in the memory cell.

Ranked #8 on Skeleton Based Action Recognition on SBU

Action Analysis Skeleton Based Action Recognition

Paper
Add Code

Full-Time Supervision based Bidirectional RNN for Factoid Question Answering

no code implementations • 19 Jun 2016 • Dong Xu, Wu-Jun Li

Hence, these existing models don't put supervision (loss or similarity calculation) at every time step, which will lose some useful information.

Question Answering

Paper
Add Code

Proximal Riemannian Pursuit for Large-Scale Trace-Norm Minimization

no code implementations • CVPR 2016 • Mingkui Tan, Shijie Xiao, Junbin Gao, Dong Xu, Anton Van Den Hengel, Qinfeng Shi

Trace-norm regularization plays an important role in many areas such as machine learning and computer vision.

BIG-bench Machine Learning Clustering +1

Paper
Add Code

Fast Algorithms for Linear and Kernel SVM+

no code implementations • CVPR 2016 • Wen Li, Dengxin Dai, Mingkui Tan, Dong Xu, Luc van Gool

The SVM+ approach has shown excellent performance in visual recognition tasks for exploiting privileged information in the training data.

Paper
Add Code

Dimensionality-Dependent Generalization Bounds for $k$-Dimensional Coding Schemes

no code implementations • 3 Jan 2016 • Tongliang Liu, DaCheng Tao, Dong Xu

Can we obtain dimensionality-dependent generalization bounds for $k$-dimensional coding schemes that are tighter than dimensionality-independent bounds when data is in a finite-dimensional feature space?

Clustering Dictionary Learning +2

Paper
Add Code

Multi-View Domain Generalization for Visual Recognition

no code implementations • ICCV 2015 • Li Niu, Wen Li, Dong Xu

Considering that recent works show the domain generalization capability can be enhanced by fusing multiple SVM classifiers, we build upon exemplar SVMs to learn a set of SVM classifiers, each time using one positive sample and all negative samples in the source domain.

Domain Generalization

Paper
Add Code

Object-Based RGBD Image Co-Segmentation With Mutex Constraint

no code implementations • CVPR 2015 • Huazhu Fu, Dong Xu, Stephen Lin, Jiang Liu

We present an object-based co-segmentation method that takes advantage of depth data and is able to correctly handle noisy images in which the common foreground object is missing.

Object Segmentation

Paper
Add Code

Visual Recognition by Learning From Web Data: A Weakly Supervised Domain Generalization Approach

no code implementations • CVPR 2015 • Li Niu, Wen Li, Dong Xu

In this work, we formulate a new weakly supervised domain generalization problem for the visual recognition task by using loosely labeled web images/videos as training data.

Domain Generalization

Paper
Add Code

FaLRR: A Fast Low Rank Representation Solver

no code implementations • CVPR 2015 • Shijie Xiao, Wen Li, Dong Xu, DaCheng Tao

In this paper, we develop a fast LRR solver called FaLRR, by reformulating LRR as a new optimization problem with regard to factorized data (which is obtained by skinny SVD of the original data matrix).

Clustering Face Clustering

Paper
Add Code

Scalable Nuclear-norm Minimization by Subspace Pursuit Proximal Riemannian Gradient

no code implementations • 10 Mar 2015 • Mingkui Tan, Shijie Xiao, Junbin Gao, Dong Xu, Anton Van Den Hengel, Qinfeng Shi

Nuclear-norm regularization plays a vital role in many learning tasks, such as low-rank matrix recovery (MR), and low-rank representation (LRR).

Clustering Matrix Completion

Paper
Add Code

Object-based Multiple Foreground Video Co-segmentation

no code implementations • CVPR 2014 • Huazhu Fu, Dong Xu, Bao Zhang, Stephen Lin

We present a video co-segmentation method that uses category-independent object proposals as its basic element and can extract multiple foreground objects in a video set.

Object Segmentation

Paper
Add Code

Recognizing RGB Images by Learning from RGB-D Data

no code implementations • CVPR 2014 • Lin Chen, Wen Li, Dong Xu

In this work, we propose a new framework for recognizing RGB images captured by the conventional cameras by leveraging a set of labeled RGB-D data, in which the depth features can be additionally extracted from the depth images.

Object Recognition Unsupervised Domain Adaptation

Paper
Add Code

Fusing Robust Face Region Descriptors via Multiple Metric Learning for Face Recognition in the Wild

no code implementations • CVPR 2013 • Zhen Cui, Wen Li, Dong Xu, Shiguang Shan, Xilin Chen

Spatial-Temporal Face Region Descriptor, STFRD) for images (resp.

Face Recognition Face Verification +1

Paper
Add Code

Event Recognition in Videos by Learning from Heterogeneous Web Sources

no code implementations • CVPR 2013 • Lin Chen, Lixin Duan, Dong Xu

In this work, we propose to leverage a large number of loosely labeled web videos (e.g., from YouTube) and web images (e.g., from Google/Bing image search) for visual event recognition in consumer videos without requiring any labeled consumer videos.

Domain Adaptation Image Retrieval

Paper
Add Code

Learning by Associating Ambiguously Labeled Images

no code implementations • CVPR 2013 • Zinan Zeng, Shijie Xiao, Kui Jia, Tsung-Han Chan, Shenghua Gao, Dong Xu, Yi Ma

Our framework is motivated by the observation that samples from the same class repetitively appear in the collection of ambiguously labeled training images, while they are just ambiguously labeled in each image.

Paper
Add Code
