no code implementations • 1 Jan 2021 • Mikhail Burtsev, Yurii Kuratov, Anton Peganov, Grigory V. Sapunov
Adding trainable memory to selectively store local as well as global representations of a sequence is a promising direction to improve the Transformer model.
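The idea can be sketched as prepending trainable memory slots to the token sequence so that self-attention lets them accumulate global context while tokens keep local representations. A minimal single-head NumPy illustration with assumed dimensions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16          # model dimension (illustrative)
num_mem = 4     # number of trainable memory slots (assumption)
seq_len = 10

# Trainable global memory: learned embeddings prepended to every input.
memory = rng.standard_normal((num_mem, d)) * 0.02
tokens = rng.standard_normal((seq_len, d))


def self_attention(x, Wq, Wk, Wv):
    """Single-head self-attention over the full [memory; tokens] sequence."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v


Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.02 for _ in range(3))

# Memory slots attend to (and are attended by) all tokens, so after the
# layer the first num_mem rows hold updated global representations.
augmented = np.concatenate([memory, tokens], axis=0)   # (num_mem + seq_len, d)
out = self_attention(augmented, Wq, Wk, Wv)
updated_memory, updated_tokens = out[:num_mem], out[num_mem:]
```

In training, `memory` would be a learned parameter shared across inputs, updated by backpropagation like any other embedding.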