3 code implementations • 19 Apr 2023 • Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, Mikhail S. Burtsev
A major limitation on the range of problems that transformers can solve is the quadratic scaling of computational complexity with input size.
3 code implementations • 14 Jul 2022 • Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev
We implement a memory mechanism with no changes to the Transformer model by adding special memory tokens to the input or output sequence.
1 code implementation • 20 Jun 2020 • Mikhail S. Burtsev, Yuri Kuratov, Anton Peganov, Grigory V. Sapunov
Adding trainable memory to selectively store local as well as global representations of a sequence is a promising direction to improve the Transformer model.
no code implementations • 7 May 2019 • Artyom Y. Sorokin, Mikhail S. Burtsev
Episodic memory plays an important role in the behavior of animals and humans.