no code implementations • 15 Apr 2024 • Daniil Merkulov, Daria Cherniuk, Alexander Rudikov, Ivan Oseledets, Ekaterina Muravleva, Aleksandr Mikhalev, Boris Kashin
In this paper, we introduce an algorithm for data quantization based on the principles of Kashin representation.
no code implementations • 2 Feb 2024 • Daniel Bershatsky, Daria Cherniuk, Talgat Daulbaev, Aleksandr Mikhalev, Ivan Oseledets
In this paper, we generalize and extend the idea of low-rank adaptation (LoRA) of large language models (LLMs) based on the Transformer architecture.
no code implementations • 6 Dec 2023 • Daria Cherniuk, Aleksandr Mikhalev, Ivan Oseledets
LoRA is a technique that reduces the number of trainable parameters in a neural network by introducing low-rank adapters to linear layers.
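The parameter saving behind LoRA can be illustrated with a minimal sketch: a frozen weight matrix W is augmented by a trainable low-rank product B @ A, so only r*(d_in + d_out) parameters are trained instead of d_out*d_in. This is a generic illustration of the LoRA idea, not the specific method of the paper; all names and sizes below are assumptions.

```python
import numpy as np

# Minimal LoRA-style adapter sketch (illustrative; not the paper's exact method).
# The effective weight is W + B @ A, where rank r << min(d_in, d_out).
rng = np.random.default_rng(0)
d_in, d_out, r = 64, 32, 4  # hypothetical layer sizes and adapter rank

W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection, zero-initialized

def lora_forward(x):
    # y = (W + B A) x; with B = 0 the adapter is a no-op at initialization
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
assert np.allclose(lora_forward(x), W @ x)  # identity at initialization

# Trainable parameters: full fine-tuning vs. LoRA adapter
print(d_out * d_in, r * (d_in + d_out))  # 2048 vs 384
```

Because B starts at zero, the adapted model reproduces the pretrained model exactly before training, and only the small A and B matrices receive gradient updates.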
2 code implementations • 31 Jan 2022 • Daniel Bershatsky, Aleksandr Mikhalev, Alexandr Katrutsa, Julia Gusak, Daniil Merkulov, Ivan Oseledets
We also investigate the variance of the gradient estimate induced by the randomized matrix multiplication.
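A standard form of randomized matrix multiplication, which can serve as a reference point here, samples inner indices and rescales to keep the estimator unbiased; its variance shrinks as the number of samples grows. The sketch below shows uniform column sampling as a generic illustration — the sampling scheme and variable names are assumptions, not the paper's exact estimator.

```python
import numpy as np

# Randomized matrix multiplication by column/row sampling (generic sketch).
# Sampling s inner indices with probabilities p and rescaling by 1/(s * p)
# gives an unbiased estimate of A @ B: E[est] = A @ B.
rng = np.random.default_rng(0)
m, n, k, s = 20, 30, 50, 200  # hypothetical sizes; s samples of k inner indices

A = rng.standard_normal((m, k))
B = rng.standard_normal((k, n))

p = np.full(k, 1.0 / k)              # uniform sampling probabilities
idx = rng.choice(k, size=s, p=p)     # sample with replacement
est = (A[:, idx] / (s * p[idx])) @ B[idx, :]

exact = A @ B
rel_err = np.linalg.norm(est - exact) / np.linalg.norm(exact)
print(est.shape, rel_err)
```

Increasing s reduces the variance of the estimate (and hence rel_err on average), at the cost of more computation; non-uniform probabilities proportional to column/row norms are a common variance-reduction refinement.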