1 code implementation • 16 Oct 2021 • Frederick Liu, Terry Huang, Shihang Lyu, Siamak Shakeri, Hongkun Yu, Jing Li
Pre-trained encoder-decoder transformer architectures have become increasingly popular with the advent of T5 models.
Decoder • Language Modelling • +3