Search Results for author: Haoran Duan

Found 14 papers, 8 papers with code

Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching

1 code implementation • 18 May 2024 • Xingyu Miao, Haoran Duan, Varun Ojha, Jun Song, Tejal Shah, Yang Long, Rajiv Ranjan

In this work, we propose a novel Trajectory Score Matching (TSM) method that aims to solve the pseudo ground truth inconsistency problem caused by the accumulated error in Interval Score Matching (ISM) when using the Denoising Diffusion Implicit Models (DDIM) inversion process.

3D Generation • Denoising • +1

From Sora What We Can See: A Survey of Text-to-Video Generation

1 code implementation • 17 May 2024 • Rui Sun, Yumin Zhang, Tejal Shah, Jiahao Sun, Shuoying Zhang, Wenqi Li, Haoran Duan, Bo Wei, Rajiv Ranjan

With impressive achievements made, artificial intelligence is on the path forward to artificial general intelligence.

Text-to-Video Generation • Video Generation

Sentinel-Guided Zero-Shot Learning: A Collaborative Paradigm without Real Data Exposure

no code implementations • 14 Mar 2024 • Fan Wan, Xingyu Miao, Haoran Duan, Jingjing Deng, Rui Gao, Yang Long

With increasing concerns over data privacy and model copyrights, especially in the context of collaborations between AI service providers and data owners, an innovative SG-ZSL paradigm is proposed in this work.

Zero-Shot Learning

Pixel Sentence Representation Learning

1 code implementation • 13 Feb 2024 • Chenghao Xiao, Zhuoxu Huang, Danlu Chen, G Thomas Hudson, Yizhi Li, Haoran Duan, Chenghua Lin, Jie Fu, Jungong Han, Noura Al Moubayed

To our knowledge, this is the first representation learning method devoid of traditional language models for understanding sentence and document semantics, marking a stride closer to human-like textual comprehension.

Natural Language Inference • Representation Learning • +3

ConRF: Zero-shot Stylization of 3D Scenes with Conditioned Radiation Fields

1 code implementation • 2 Feb 2024 • Xingyu Miao, Yang Bai, Haoran Duan, Fan Wan, Yawen Huang, Yang Long, Yefeng Zheng

Most existing work on arbitrary 3D NeRF style transfer requires retraining for each individual style condition.

Style Transfer

CTNeRF: Cross-Time Transformer for Dynamic Neural Radiance Field from Monocular Video

no code implementations • 10 Jan 2024 • Xingyu Miao, Yang Bai, Haoran Duan, Yawen Huang, Fan Wan, Yang Long, Yefeng Zheng

The goal of our work is to generate high-quality novel views from monocular videos of complex and dynamic scenes.

Dual Feature Augmentation Network for Generalized Zero-shot Learning

1 code implementation • 25 Sep 2023 • Lei Xiang, Yuan Zhou, Haoran Duan, Yang Long

To address these issues, we propose a novel Dual Feature Augmentation Network (DFAN), which comprises two feature augmentation modules, one for visual features and the other for semantic features.

Attribute • Generalized Zero-Shot Learning

UniHead: Unifying Multi-Perception for Detection Heads

1 code implementation • 23 Sep 2023 • Hantao Zhou, Rui Yang, Yachao Zhang, Haoran Duan, Yawen Huang, Runze Hu, Xiu Li, Yefeng Zheng

More precisely, our approach (1) introduces deformation perception, enabling the model to adaptively sample object features; (2) proposes a Dual-axial Aggregation Transformer (DAT) to adeptly model long-range dependencies, thereby achieving global perception; and (3) devises a Cross-task Interaction Transformer (CIT) that facilitates interaction between the classification and localization branches, thus aligning the two tasks.

DS-Depth: Dynamic and Static Depth Estimation via a Fusion Cost Volume

1 code implementation • 14 Aug 2023 • Xingyu Miao, Yang Bai, Haoran Duan, Yawen Huang, Fan Wan, Xinxing Xu, Yang Long, Yefeng Zheng

Nevertheless, the dynamic cost volume inevitably generates extra occlusions and noise, thus we alleviate this by designing a fusion module that makes static and dynamic cost volumes compensate for each other.

Ranked #3 on Unsupervised Monocular Depth Estimation on Cityscapes

Monocular Depth Estimation • Optical Flow Estimation • +1

Knowing the Past to Predict the Future: Reinforcement Virtual Learning

no code implementations • 2 Nov 2022 • Peng Zhang, Yawen Huang, Bingzhang Hu, Shizheng Wang, Haoran Duan, Noura Al Moubayed, Yefeng Zheng, Yang Long

Reinforcement Learning (RL)-based control systems have received considerable attention in recent decades.

Reinforcement Learning (RL)

Absolute Zero-Shot Learning

no code implementations • 23 Feb 2022 • Rui Gao, Fan Wan, Daniel Organisciak, Jiyao Pu, Junyan Wang, Haoran Duan, Peng Zhang, Xingsong Hou, Yang Long

Considering the increasing concerns about data copyright and privacy issues, we present a novel Absolute Zero-Shot Learning (AZSL) paradigm, i.e., training a classifier with zero real data.

Transfer Learning • Zero-Shot Learning

Semi-Supervised Crowd Counting from Unlabeled Data

no code implementations • 31 Aug 2021 • Haoran Duan, Fan Wan, Rui Sun, Zeyu Wang, Varun Ojha, Yu Guan, Hubert P. H. Shum, Bingzhang Hu, Yang Long

Our method achieved competitive performance among semi-supervised learning approaches on these crowd counting datasets.

Crowd Counting

EfficientTDNN: Efficient Architecture Search for Speaker Recognition

1 code implementation • 25 Mar 2021 • Rui Wang, Zhihua Wei, Haoran Duan, Shouling Ji, Yang Long, Zhen Hong

Compared with hand-designed approaches, neural architecture search (NAS) has emerged as a practical technique for automating the architecture design process and has attracted increasing interest in spoken language processing tasks such as speaker recognition.

Data Augmentation • Network Pruning • +2

SOFA-Net: Second-Order and First-order Attention Network for Crowd Counting

no code implementations • 9 Aug 2020 • Haoran Duan, Shidong Wang, Yu Guan

To obtain the appropriate crowd representation, in this work we proposed SOFA-Net (Second-Order and First-order Attention Network): second-order statistics were extracted to retain selectivity of the channel-wise spatial information for dense heads, while first-order statistics, which can enhance the feature discrimination for the heads' areas, were used as complementary information.

Crowd Counting
