no code implementations • 7 May 2024 • Runheng Liu, Xingchen Xiao, Heyan Huang, Zewen Chi, Zhijing Wu
In the runtime test, FlashBack's inference is up to $4\times$ faster than the prepending counterpart on a 7B LLM (Llama 2).
no code implementations • 24 Oct 2023 • Yizhe Yang, Huashan Sun, Jiawei Li, Runheng Liu, Yinghao Li, Yuhang Liu, Heyan Huang, Yang Gao
Large Language Models (LLMs) have demonstrated remarkable performance across various natural language tasks, marking significant strides towards general artificial intelligence.