1 code implementation • 7 May 2024 • Zhuoyi Yang, Heyang Jiang, Wenyi Hong, Jiayan Teng, Wendi Zheng, Yuxiao Dong, Ming Ding, Jie Tang
However, due to the quadratic growth of memory when generating ultra-high-resolution images (e.g. 4096×4096), the resolution of generated images is often limited to 1024×1024.
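The quadratic growth comes from self-attention over image tokens: the number of tokens grows with the square of the resolution, and a full attention map grows with the square of the token count. A minimal back-of-the-envelope sketch (the patch size of 8 is an illustrative assumption, not a value from the paper):

```python
# Sketch: why attention memory explodes at high resolution.
# An H x W image tokenized into (H/p) x (W/p) patches yields tokens = (H/p)*(W/p);
# a dense attention map then needs tokens^2 entries.
def attention_entries(resolution: int, patch: int = 8) -> int:
    tokens = (resolution // patch) ** 2   # number of image tokens
    return tokens * tokens                # entries in one full attention map

# Going from 1024x1024 to 4096x4096 multiplies token count by 16,
# and full-attention memory by 16^2 = 256.
ratio = attention_entries(4096) / attention_entries(1024)
print(ratio)  # 256.0
```

This is why dense-attention generators cap out around 1024×1024 without memory-efficient or hierarchical attention schemes.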
no code implementations • 8 Mar 2024 • Wendi Zheng, Jiayan Teng, Zhuoyi Yang, Weihan Wang, Jidong Chen, Xiaotao Gu, Yuxiao Dong, Ming Ding, Jie Tang
Recent advancements in text-to-image generative systems have been largely driven by diffusion models.
1 code implementation • 23 Feb 2024 • Zhefan Wang, Yuanqing Yu, Wendi Zheng, Weizhi Ma, Min Zhang
LLM-based agents have gained considerable attention for their decision-making skills and ability to handle complex tasks.
1 code implementation • 4 Sep 2023 • Jiayan Teng, Wendi Zheng, Ming Ding, Wenyi Hong, Jianqiao Wangni, Zhuoyi Yang, Jie Tang
Diffusion models have achieved great success in image synthesis, but still face challenges in high-resolution generation.
Ranked #1 on Image Generation on CelebA-HQ 256x256
10 code implementations • 5 Oct 2022 • Aohan Zeng, Xiao Liu, Zhengxiao Du, Zihan Wang, Hanyu Lai, Ming Ding, Zhuoyi Yang, Yifan Xu, Wendi Zheng, Xiao Xia, Weng Lam Tam, Zixuan Ma, Yufei Xue, Jidong Zhai, WenGuang Chen, Peng Zhang, Yuxiao Dong, Jie Tang
We introduce GLM-130B, a bilingual (English and Chinese) pre-trained language model with 130 billion parameters.
Ranked #1 on Language Modelling on CLUE (OCNLI_50K)
1 code implementation • 29 May 2022 • Wenyi Hong, Ming Ding, Wendi Zheng, Xinghan Liu, Jie Tang
Large-scale pretrained transformers have created milestones in text (GPT-3) and text-to-image (DALL-E and CogView) generation.
Ranked #12 on Video Generation on UCF-101
1 code implementation • 28 Apr 2022 • Ming Ding, Wendi Zheng, Wenyi Hong, Jie Tang
The development of transformer-based text-to-image models is impeded by their slow generation and high complexity for high-resolution images.
Ranked #44 on Text-to-Image Generation on MS COCO
4 code implementations • NeurIPS 2021 • Ming Ding, Zhuoyi Yang, Wenyi Hong, Wendi Zheng, Chang Zhou, Da Yin, Junyang Lin, Xu Zou, Zhou Shao, Hongxia Yang, Jie Tang
Text-to-Image generation in the general domain has long been an open problem, which requires both a powerful generative model and cross-modal understanding.
Ranked #56 on Text-to-Image Generation on MS COCO (using extra training data)