Search Results for author: Boqiang Duan

Found 1 paper, 1 paper with code

ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation

2 code implementations • 31 Dec 2021 • Han Zhang, Weichong Yin, Yewei Fang, Lanxin Li, Boqiang Duan, Zhihua Wu, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

To explore the landscape of large-scale pre-training for bidirectional text-image generation, we train a 10-billion-parameter ERNIE-ViLG model on a large-scale dataset of 145 million (Chinese) image-text pairs. The model achieves state-of-the-art performance on both text-to-image and image-to-text tasks, obtaining an FID of 7.9 on MS-COCO for text-to-image synthesis and the best results on COCO-CN and AIC-ICC for image captioning.

Ranked #42 on Text-to-Image Generation on MS COCO

Image Captioning • Quantization • +2
