no code implementations • 23 May 2024 • Dawid J. Kopiczko, Tijmen Blankevoort, Yuki M. Asano
We introduce Bitune, a method that improves instruction-tuning of pretrained decoder-only large language models, leading to consistent gains on downstream tasks.
no code implementations • 17 Oct 2023 • Dawid J. Kopiczko, Tijmen Blankevoort, Yuki M. Asano
Low-rank adaptation (LoRA) is a popular method that reduces the number of trainable parameters when finetuning large language models, but it still faces acute storage challenges when scaling to even larger models or deploying numerous per-user or per-task adapted models.
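For context, here is a minimal PyTorch sketch of the standard LoRA technique this abstract builds on, not the method proposed in the paper itself: a frozen pretrained linear layer augmented with a trainable low-rank update W x + (alpha/r)·B A x. The class name `LoRALinear` and the default values of `r` and `alpha` are illustrative choices, not taken from the paper.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (standard LoRA)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        # Freeze the pretrained weights; only A and B are trained.
        self.base.weight.requires_grad_(False)
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # A is small-random, B is zero, so the update starts at exactly zero.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base output plus the scaled low-rank correction B A x.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)
```

The storage cost the abstract refers to comes from the per-layer A and B matrices: each adapted model must store roughly r·(in_features + out_features) extra parameters per layer, which adds up across many per-user or per-task adapters.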