1 code implementation • 18 Dec 2023 • Ziyi Chen, Xiaocong Yang, Jiacheng Lin, Chenkai Sun, Kevin Chen-Chuan Chang, Jie Huang
Introduced to enhance the efficiency of large language model (LLM) inference, speculative decoding operates by having a smaller model generate a draft that the larger target model then verifies.
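As a rough illustration of that draft-then-verify loop, here is a minimal greedy sketch; it is not the method proposed in the paper above, and `draft_model` and `target_model` are hypothetical callables standing in for the two models.

```python
# Minimal greedy speculative decoding sketch (illustrative assumptions):
# - draft_model(tokens) returns the small model's greedy next token.
# - target_model(tokens, draft) scores the drafted continuation in one
#   pass and returns the large model's greedy token at each of the k
#   drafted positions plus one extra token (k + 1 tokens total).

def speculative_decode(target_model, draft_model, prompt, k=4, max_new=64):
    tokens = list(prompt)
    while len(tokens) < len(prompt) + max_new:
        # 1. The small draft model proposes k tokens autoregressively.
        draft = []
        for _ in range(k):
            draft.append(draft_model(tokens + draft))
        # 2. The large target model verifies the whole draft at once.
        verified = target_model(tokens, draft)  # k + 1 tokens
        # 3. Accept the longest prefix where draft and target agree,
        #    then take one extra token from the target model, so each
        #    round yields between 1 and k + 1 tokens.
        n = 0
        while n < k and draft[n] == verified[n]:
            n += 1
        tokens.extend(draft[:n] + [verified[n]])
    return tokens[:len(prompt) + max_new]
```

The speedup comes from step 2: the expensive target model checks k drafted tokens in a single forward pass instead of generating them one by one.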
1 code implementation • 10 Oct 2022 • Xiaocong Yang, James Y. Huang, Wenxuan Zhou, Muhao Chen
Parameter-efficient tuning aims at updating only a small subset of parameters when adapting a pretrained model to downstream tasks.
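For illustration, below is a minimal PyTorch sketch of one such scheme, bias-only (BitFit-style) tuning; this is a generic example of updating a small parameter subset, not the method this paper proposes, and `model` is assumed to be an already loaded pretrained network.

```python
import torch

def mark_bias_only_trainable(model: torch.nn.Module) -> int:
    """Freeze all pretrained weights; leave only bias terms trainable.

    Returns the number of trainable parameters, which is typically a
    tiny fraction (well under 1%) of the full model.
    """
    trainable = 0
    for name, param in model.named_parameters():
        # Only parameters whose name ends in "bias" receive gradients.
        param.requires_grad = name.endswith("bias")
        if param.requires_grad:
            trainable += param.numel()
    return trainable

# Usage sketch: the optimizer sees only the small trainable subset.
# n_trainable = mark_bias_only_trainable(model)
# optimizer = torch.optim.AdamW(
#     (p for p in model.parameters() if p.requires_grad), lr=1e-3)
```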
1 code implementation • 7 Nov 2021 • Xingcheng Yao, Yanan Zheng, Xiaocong Yang, Zhilin Yang
Pretrained language models have become the standard approach for many NLP tasks thanks to their strong performance, but they are very expensive to train.
2 code implementations • 3 Aug 2021 • Hao Zhou, Pei Ke, Zheng Zhang, Yuxian Gu, Yinhe Zheng, Chujie Zheng, Yida Wang, Chen Henry Wu, Hao Sun, Xiaocong Yang, Bosi Wen, Xiaoyan Zhu, Minlie Huang, Jie Tang
Although pre-trained language models have substantially improved the generation ability of dialogue systems, open-domain Chinese dialogue systems are still limited by available dialogue data and model size compared with their English counterparts.