An Efficient Spatial Pyramid (ESP) is an image model block based on a factorization principle that decomposes a standard convolution into two steps: (1) point-wise convolutions and (2) a spatial pyramid of dilated convolutions. The point-wise convolutions reduce the computation, while the spatial pyramid of dilated convolutions re-samples the feature maps to learn representations from a large effective receptive field. This makes the block more efficient than other image model blocks such as ResNeXt blocks and Inception modules.
Source: ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation

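
The saving from this factorization can be illustrated with some simple arithmetic. The sketch below is not taken from this page. It assumes the configuration usually described for ESP in the ESPNet paper: the point-wise step reduces M input channels to d = N/K channels, and K parallel n×n dilated convolutions with dilation rates 1, 2, ..., 2^(K-1) each map d channels to d channels, with biases ignored. The function names and the example sizes (M = N = 128, n = 3, K = 4) are hypothetical, chosen only for illustration.

```python
def standard_conv_params(m, n_out, k):
    """Weights in a standard k x k convolution from m to n_out channels (no bias)."""
    return k * k * m * n_out


def esp_params(m, n_out, k, branches):
    """Weights in an ESP-style factorization (assumed layout, no bias).

    Step 1: a 1x1 convolution reduces m channels to d = n_out / branches.
    Step 2: `branches` parallel k x k dilated convolutions, each d -> d.
    Dilation changes the receptive field but not the number of weights.
    """
    if n_out % branches != 0:
        raise ValueError("n_out must be divisible by the number of branches")
    d = n_out // branches
    pointwise = m * d
    pyramid = branches * (k * k * d * d)
    return pointwise + pyramid


def effective_kernel_size(k, dilation):
    """Side length of the area covered by a k x k kernel with the given dilation."""
    return (k - 1) * dilation + 1


# Illustrative sizes only (assumptions, not values from this page).
M, N, KERNEL, BRANCHES = 128, 128, 3, 4

# Assumed dilation schedule: 1, 2, 4, ..., 2^(K-1).
dilations = [2 ** i for i in range(BRANCHES)]

standard = standard_conv_params(M, N, KERNEL)
esp = esp_params(M, N, KERNEL, BRANCHES)

print("standard conv weights:", standard)
print("ESP block weights:    ", esp)
print("reduction factor:     ", standard / esp)
print("largest effective kernel side:", effective_kernel_size(KERNEL, dilations[-1]))
```

Under these assumptions the block uses 40,960 weights instead of 147,456 (a 3.6× reduction), while its most dilated branch covers a 17×17 area of the input.
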
Task | Papers | Share |
---|---|---|
Speech Recognition | 10 | 17.54% |
Automatic Speech Recognition (ASR) | 8 | 14.04% |
Semantic Segmentation | 5 | 8.77% |
Language Modelling | 3 | 5.26% |
Reinforcement Learning (RL) | 3 | 5.26% |
Speech Separation | 2 | 3.51% |
Decoder | 2 | 3.51% |
Real-Time Semantic Segmentation | 2 | 3.51% |
Autonomous Driving | 1 | 1.75% |

Component | Type |
---|---|
Dilated Convolution | Convolutions |
Hierarchical Feature Fusion | Degridding |
Pointwise Convolution | Convolutions |