Subformer

Subformer is a Transformer variant that combines two parameter-reduction techniques: sandwich-style parameter sharing, which shares the middle layers while keeping the first and last layers distinct, overcoming the weaknesses of naive cross-layer parameter sharing in generative models; and self-attentive embedding factorization (SAFE), in which a small self-attention layer reduces the embedding parameter count.

Source: Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers
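Below is a minimal PyTorch sketch of the two ideas. Module names, dimensions, and layer counts are illustrative choices, not the paper's configuration, and SAFE is rendered here under the assumption of a small self-attention block over low-dimensional embeddings followed by an up-projection.

```python
import torch
import torch.nn as nn

class SandwichSharedEncoder(nn.Module):
    """Sandwich-style parameter sharing (sketch): the first and last
    layers keep their own weights, while every middle layer reuses
    one shared layer's parameters."""
    def __init__(self, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        self.first = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        # One parameter set applied (num_layers - 2) times.
        self.shared = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.last = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.num_middle = num_layers - 2

    def forward(self, x):
        x = self.first(x)
        for _ in range(self.num_middle):
            x = self.shared(x)  # identical weights, applied repeatedly
        return self.last(x)

class SAFEEmbedding(nn.Module):
    """SAFE (sketch): embed tokens in a small dimension, mix them with
    a small self-attention layer, then project up to the model
    dimension. This replaces a full vocab x d_model embedding matrix
    with vocab x d_small plus a cheap attention/projection stage."""
    def __init__(self, vocab_size=32000, d_small=128, d_model=512, nhead=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_small)  # vocab x d_small
        self.attn = nn.MultiheadAttention(d_small, nhead, batch_first=True)
        self.up = nn.Linear(d_small, d_model)           # d_small -> d_model

    def forward(self, tokens):
        e = self.embed(tokens)     # (batch, seq, d_small)
        a, _ = self.attn(e, e, e)  # small self-attention over embeddings
        return self.up(a)          # (batch, seq, d_model)

# Usage: factorized embeddings feed the sandwich-shared encoder.
tokens = torch.randint(0, 32000, (2, 16))
x = SAFEEmbedding()(tokens)
out = SandwichSharedEncoder()(x)   # (2, 16, 512)
```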
