GShard is an intra-layer parallel distributed method. It consists of a set of simple annotation APIs and a compiler extension in XLA for automatic parallelization.
Source: GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
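To make "intra-layer parallel" concrete, here is a toy sketch in plain Python. It assumes a single dense layer whose weight matrix is split column-wise across simulated devices. The helpers `matmul`, `split_columns`, and `sharded_matmul` are hypothetical names invented for this illustration; they are not GShard or XLA APIs, and the real system derives this partitioning automatically in the compiler from user annotations.

```python
# Toy illustration only: plain Python lists stand in for tensors and devices.
# `matmul`, `split_columns`, and `sharded_matmul` are hypothetical helpers,
# not GShard or XLA APIs.

def matmul(x, w):
    """Multiply matrix x (n x k) by matrix w (k x m), both given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def split_columns(w, num_partitions):
    """Split w column-wise into equally sized shards, one per simulated device."""
    cols = len(w[0])
    if cols % num_partitions != 0:
        raise ValueError("column count must be divisible by num_partitions")
    size = cols // num_partitions
    return [[row[i * size:(i + 1) * size] for row in w] for i in range(num_partitions)]


def sharded_matmul(x, w, num_partitions):
    """Each simulated device multiplies x by its own shard of w; the partial
    outputs are then concatenated along the column dimension."""
    partials = [matmul(x, shard) for shard in split_columns(w, num_partitions)]
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]

# The sharded computation matches the unsharded one.
assert sharded_matmul(x, w, 2) == matmul(x, w)
```

Each device here holds only a slice of the layer's weights, which is what distinguishes intra-layer parallelism from placing whole layers on different devices.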
Task | Papers | Share |
---|---|---|
Language Modelling | 1 | 11.11% |
Large Language Model | 1 | 11.11% |
Multi-Task Learning | 1 | 11.11% |
Information Retrieval | 1 | 11.11% |
Quantization | 1 | 11.11% |
Retrieval | 1 | 11.11% |
Machine Translation | 1 | 11.11% |
Playing the Game of 2048 | 1 | 11.11% |
Translation | 1 | 11.11% |