no code implementations • 4 Apr 2023 • Gaochen Dong, Wei Chen
This method mitigates the shift in data distribution caused by quantization, eliminating the need for retraining.
no code implementations • 16 Mar 2023 • Gaochen Dong, Wei Chen
Recent Transformer-based models such as BERT, GPT-3 and ChatGPT have achieved state-of-the-art performance across a range of natural language processing tasks.