2 code implementations • 27 Mar 2024 • Yanwei Li, Yuechen Zhang, Chengyao Wang, Zhisheng Zhong, Yixin Chen, Ruihang Chu, Shaoteng Liu, Jiaya Jia
We try to narrow the gap by mining the potential of VLMs for better performance and any-to-any workflow from three aspects, i.e., high-resolution visual tokens, high-quality data, and VLM-guided generation.
Ranked #9 on Visual Question Answering on MM-Vet
1 code implementation • 7 Dec 2023 • Yuechen Zhang, Shengju Qian, Bohao Peng, Shu Liu, Jiaya Jia
Without tuning on LLaVA-v1.5, our method secured 70.7 in the MMBench test and 1552.5 in MME-perception.
no code implementations • 1 Jun 2023 • Jinbo Xing, Menghan Xia, Yuxin Liu, Yuechen Zhang, Yong Zhang, Yingqing He, Hanyuan Liu, Haoxin Chen, Xiaodong Cun, Xintao Wang, Ying Shan, Tien-Tsin Wong
Our method, dubbed Make-Your-Video, involves joint-conditional video generation using a Latent Diffusion Model that is pre-trained for still image synthesis and then promoted for video generation with the introduction of temporal modules.
2 code implementations • NeurIPS 2023 • Yuechen Zhang, Jinbo Xing, Eric Lo, Jiaya Jia
Our pipeline enhances the generation quality of image variations by aligning the image generation process to the source image's inversion chain.
1 code implementation • 8 Mar 2023 • Shaoteng Liu, Yuechen Zhang, Wenbo Li, Zhe Lin, Jiaya Jia
This paper presents Video-P2P, a novel framework for real-world video editing with cross-attention control.
1 code implementation • CVPR 2023 • Jinbo Xing, Menghan Xia, Yuechen Zhang, Xiaodong Cun, Jue Wang, Tien-Tsin Wong
In this paper, we propose to cast speech-driven facial animation as a code query task in a finite proxy space of the learned codebook, which effectively promotes the vividness of the generated motions by reducing the cross-modal mapping uncertainty.
Ranked #4 on 3D Face Animation on BEAT2
1 code implementation • CVPR 2023 • Yuechen Zhang, Zexin He, Jinbo Xing, Xufeng Yao, Jiaya Jia
We propose a ray registration process based on the stylized reference view to obtain pseudo-ray supervision in novel views.
1 code implementation • CVPR 2022 • Xufeng Yao, Yang Bai, Xinyun Zhang, Yuechen Zhang, Qi Sun, Ran Chen, Ruiyu Li, Bei Yu
Domain generalization refers to the problem of training a model from a collection of different source domains that can directly generalize to the unseen target domains.
Ranked #17 on Domain Generalization on PACS
1 code implementation • CVPR 2022 • Tiancheng Shen, Yuechen Zhang, Lu Qi, Jason Kuen, Xingyu Xie, Jianlong Wu, Zhe Lin, Jiaya Jia
Segmenting 4K or 6K ultra high-resolution images requires extra computational consideration.