1 code implementation • 24 Apr 2024 • Zinan Guo, Yanze Wu, Zhuowei Chen, Lang Chen, Qian He
We propose Pure and Lightning ID customization (PuLID), a novel tuning-free ID customization method for text-to-image generation.
1 code implementation • 11 Mar 2024 • Tianhao Qi, Shancheng Fang, Yanze Wu, Hongtao Xie, Jiawei Liu, Lang Chen, Qian He, Yongdong Zhang
The Q-Formers are trained using paired images rather than the identical target, where the reference image and the ground-truth image share the same style or semantics.
no code implementations • 1 Mar 2024 • Wenqi Liang, Gan Sun, Qian He, Yu Ren, Jiahua Dong, Yang Cong
Relying on large language models (LLMs), embodied robots could perform complex multimodal robot manipulation tasks from visual observations with powerful generalization ability.
no code implementations • 1 Mar 2024 • Mengqi Huang, Zhendong Mao, Mingcong Liu, Qian He, Yongdong Zhang
However, the inherently entangled influence scope of pseudo-words with the given text results in a dual-optimum paradox, i.e., the similarity of the given subjects and the controllability of the given text cannot be optimal simultaneously.
no code implementations • 21 Dec 2023 • Miao Hua, Jiawei Liu, Fei Ding, Wei Liu, Jie Wu, Qian He
Diffusion-based models have demonstrated impressive capabilities for text-to-image generation and are expected to support personalized applications of subject-driven generation, which require generating customized concepts from one or a few reference images.
no code implementations • ICCV 2023 • Yuxi Ren, Jie Wu, Peng Zhang, Manlin Zhang, Xuefeng Xiao, Qian He, Rui Wang, Min Zheng, Xin Pan
Recent years have witnessed the prevailing progress of Generative Adversarial Networks (GANs) in image-to-image translation.
no code implementations • ICCV 2023 • Tianxiang Ma, Bingchuan Li, Qian He, Jing Dong, Tieniu Tan
In this paper, we introduce a novel Geometry-aware Facial Expression Translation (GaFET) framework, which is based on parametric 3D facial representations and can stably decouple expression.
no code implementations • 1 Jul 2023 • Zhuowei Chen, Shancheng Fang, Wei Liu, Qian He, Mengqi Huang, Yongdong Zhang, Zhendong Mao
While large-scale pre-trained text-to-image models can synthesize diverse and high-quality human-centric images, an intractable problem is how to preserve the face identity for conditioned face images.
no code implementations • 5 Feb 2023 • Shiqi Sun, Shancheng Fang, Qian He, Wei Liu
Specifically, our method co-encodes images and text into a new domain during the training phase.
1 code implementation • 3 Feb 2023 • Tianxiang Ma, Bingchuan Li, Qian He, Jing Dong, Tieniu Tan
CNeRF divides the image by semantic regions and learns an independent neural radiance field for each region, and finally fuses them and renders the complete image.
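The fuse step described above can be illustrated with a deliberately simplified sketch. It assumes each semantic region has already produced an RGB prediction for a pixel, plus a non-negative weight for how strongly that pixel belongs to the region, and blends them with normalized weights. The function name `fuse_regions` and the weighting scheme are hypothetical and chosen for illustration; the excerpt does not specify how CNeRF actually fuses its per-region radiance fields.

```python
def fuse_regions(region_colors, region_weights):
    """Blend per-region RGB predictions for one pixel using normalized weights.

    region_colors:  list of [r, g, b] predictions, one per semantic region.
    region_weights: list of non-negative weights, one per semantic region
                    (an assumed stand-in for a region-membership score).
    """
    total = sum(region_weights)
    if total <= 0:
        raise ValueError("at least one region weight must be positive")
    fused = [0.0, 0.0, 0.0]
    for color, weight in zip(region_colors, region_weights):
        for c in range(3):
            # Each region contributes in proportion to its normalized weight.
            fused[c] += (weight / total) * color[c]
    return fused


# A pixel on the boundary between two hypothetical regions.
boundary_pixel = fuse_regions([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], [1.0, 1.0])
# A pixel that belongs entirely to the first region.
interior_pixel = fuse_regions([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], [1.0, 0.0])
```

With equal weights the boundary pixel is an even mix of both regions, while a zero weight removes a region's contribution entirely.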
no code implementations • 31 Jan 2023 • Bingchuan Li, Tianxiang Ma, Peng Zhang, Miao Hua, Wei Liu, Qian He, Zili Yi
Specifically, in Phase I, a W-space-oriented StyleGAN inversion network is trained and used to perform image inversion and editing, which ensures editability but sacrifices reconstruction quality.
1 code implementation • 8 Jan 2023 • Bin Tang, Zhengyi Liu, Yacheng Tan, Qian He
To solve the second problem, a dual-direction short connection fusion module is used to optimize the output features of HRFormer, thereby enhancing the detailed representation of objects at the output level.
no code implementations • 19 Aug 2022 • Tailin Chen, Desen Zhou, Jian Wang, Shidong Wang, Qian He, Chuanyang Hu, Errui Ding, Yu Guan, Xuming He
In this paper, we study the problem of one-shot skeleton-based action recognition, which poses unique challenges in learning transferable representation from base classes to novel classes, particularly for fine-grained actions.
no code implementations • 10 Aug 2022 • Ijaz Gul, Changyue Liu, Yuan Xi, Zhicheng Du, Shiyao Zhai, Zhengyang Lei, Chen Qun, Muhammad Akmal Raheem, Qian He, Zhang Haihui, Canyang Zhang, Runming Wang, Sanyang Han, Du Ke, Peiwu Qin
Objectives: This review evaluates the current monkeypox virus (MPXV) detection methods, discusses their pros and cons, and recommends solutions to the problems.
no code implementations • 24 Jun 2022 • Chuyu Zhang, Chuanyang Hu, Ruijie Xu, Zhitong Gao, Qian He, Xuming He
Our insight is to use mutual information to measure the relation between seen and unseen classes in a restricted label space; maximizing this mutual information promotes the transfer of semantic knowledge.
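As a reminder of the quantity involved, the sketch below computes the standard mutual information I(X; Y) = Σ p(x, y) log(p(x, y) / (p(x) p(y))) from a discrete joint probability table. The two example tables are hypothetical stand-ins for a joint distribution over seen-class and unseen-class predictions; how the paper estimates or maximizes this quantity is not stated in the excerpt.

```python
import math


def mutual_information(joint):
    """Mutual information I(X; Y) in nats for a joint probability table.

    joint[i][j] is P(X = i, Y = j); entries are non-negative and sum to 1.
    """
    row_marginal = [sum(row) for row in joint]        # P(X = i)
    col_marginal = [sum(col) for col in zip(*joint)]  # P(Y = j)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:  # terms with p = 0 contribute nothing
                mi += p * math.log(p / (row_marginal[i] * col_marginal[j]))
    return mi


# Hypothetical joint tables: one where the variables are independent,
# one where each value of X determines the value of Y.
independent = [[0.25, 0.25], [0.25, 0.25]]
dependent = [[0.5, 0.0], [0.0, 0.5]]

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
```

Independent variables give zero mutual information, and the fully dependent binary case gives log 2 nats, which is the maximum for two balanced binary variables.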
1 code implementation • 12 Apr 2022 • Zhengyi Liu, Yacheng Tan, Qian He, Yun Xiao
It is driven by Swin Transformer to extract hierarchical features, boosted by an attention mechanism to bridge the gap between the two modalities, and guided by edge information to sharpen the contours of salient objects.
no code implementations • CVPR 2022 • Wei Liu, Fangyue Liu, Fei Ding, Qian He, Zili Yi
The cross-modality encoder is pre-trained in a self-supervised manner to allow effective capture of cross- and intra-modality correlations, which facilitates content-style disentanglement and the modeling of style representations at all scales (stroke-level, component-level and character-level).
no code implementations • CVPR 2022 • Chao Xu, Jiangning Zhang, Miao Hua, Qian He, Zili Yi, Yong Liu
This paper presents a novel Region-Aware Face Swapping (RAFSwap) network to achieve identity-consistent harmonious high-resolution face generation in a local-global manner: 1) Local Facial Region-Aware (FRA) branch augments local identity-relevant features by introducing the Transformer to effectively model misaligned cross-scale semantic interaction.
2 code implementations • 1 Mar 2022 • ZiHao Wang, Wei Liu, Qian He, Xinglong Wu, Zili Yi
Once trained, the transformer can generate coherent image tokens based on the text embedding extracted from the text encoder of CLIP upon an input text.
1 code implementation • 3 Feb 2022 • Weizhen Liu, Qian He, Xuming He
Weakly supervised nuclei segmentation is a critical problem for pathological image analysis and greatly benefits the community due to the significant reduction of labeling cost.
1 code implementation • 22 Sep 2021 • Bingchuan Li, Shaofei Cai, Wei Liu, Peng Zhang, Qian He, Miao Hua, Zili Yi
To address these limitations, we design a Dynamic Style Manipulation Network (DyStyle) whose structure and parameters vary by input samples, to perform nonlinear and adaptive manipulation of latent codes for flexible and precise attribute control.
1 code implementation • 22 Sep 2021 • Miao Hua, Lijie Liu, Ziyang Cheng, Qian He, Bingchuan Li, Zili Yi
However, this technique does not satisfy the requirements of facial parts removal, as it is hard to obtain "ground-truth" images with real "blank" faces.
1 code implementation • 9 Sep 2021 • Qian He, Desen Zhou, Bo Wan, Xuming He
To address those challenges, we adopt a primitive-based representation for 3D objects, and propose a two-stage graph network for primitive-based 3D object estimation, which consists of a sequential proposal module and a graph reasoning module.
1 code implementation • 27 Apr 2021 • Qian He, Shuailin Li, Xuming He
Moreover, we introduce a weak annotation scheme with a hybrid label design for volumetric images, which improves model learning without increasing the overall annotation cost.
no code implementations • 30 Jul 2019 • Hengkai Guo, Wenji Wang, Guanjun Guo, Huaxia Li, Jiachen Liu, Qian He, Xuefeng Xiao
While propagation-based approaches have achieved state-of-the-art performance for video object segmentation, the literature lacks a fair comparison of different methods using the same settings.
no code implementations • 19 Feb 2019 • Nouamane Laanait, Qian He, Albina Y. Borisevich
Deep learning has demonstrated superb efficacy in processing imaging data, yet its suitability in solving challenging inverse problems in scientific imaging has not been fully explored.