Search Results for author: Georgii Novikov

Found 2 papers, 1 paper with code

Efficient GPT Model Pre-training using Tensor Train Matrix Representation

no code implementations • 5 Jun 2023 • Viktoriia Chekalina, Georgii Novikov, Julia Gusak, Ivan Oseledets, Alexander Panchenko

On the downstream tasks, including language understanding and text summarization, the model performs similarly to the original GPT-2 model.

Language Modelling • Text Summarization

Few-Bit Backward: Quantized Gradients of Activation Functions for Memory Footprint Reduction

2 code implementations • 1 Feb 2022 • Georgii Novikov, Daniel Bershatsky, Julia Gusak, Alex Shonenkov, Denis Dimitrov, Ivan Oseledets

Every modern neural network model has quite a few pointwise nonlinearities in its architecture, and such operations induce additional memory costs which, as we show, can be significantly reduced by quantization of the gradients.

Neural Network Compression • Quantization
