Temporal Pyramid Network (TPN) is a feature-level pyramid module for action recognition that can be integrated into 2D or 3D backbone networks in a plug-and-play manner. Two components, the source of features and the fusion of features, form a feature hierarchy for the backbone so that it can capture action instances at various tempos. In the TPN, a Backbone Network extracts features at multiple levels; Spatial Semantic Modulation spatially downsamples the features to align their semantics; Temporal Rate Modulation temporally downsamples the features to adjust the relative tempo among levels; Information Flow aggregates features in various directions to enrich the level-wise representations; and Final Prediction rescales all pyramid levels and concatenates them along the channel dimension.
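To make the data flow concrete, the following is a minimal shape-tracing sketch in plain Python, not the authors' implementation. It only tracks how the tensor shape (channels, frames, height, width) of each pyramid level could change through the stages described above. The names `FeatureShape` and `tpn_shapes`, the common channel width of 1024, the example backbone shapes, and the temporal rates `[1, 2]` are all assumptions chosen for illustration. Information Flow is treated as shape-preserving here, which is a simplification.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FeatureShape:
    """Shape of one pyramid level (hypothetical helper type)."""
    channels: int
    frames: int  # temporal length
    height: int
    width: int


def spatial_modulation(f: FeatureShape, target_hw: Tuple[int, int],
                       out_channels: int) -> FeatureShape:
    # Spatially downsample to the top level's spatial size and project
    # to a common channel width (assumed) so levels are comparable.
    return FeatureShape(out_channels, f.frames, target_hw[0], target_hw[1])


def temporal_rate_modulation(f: FeatureShape, rate: int) -> FeatureShape:
    # Temporally downsample by an integer factor (ceiling division).
    return FeatureShape(f.channels, -(-f.frames // rate), f.height, f.width)


def final_prediction(levels: List[FeatureShape]) -> FeatureShape:
    # Rescale every level to the shortest temporal length (assumed),
    # then concatenate along the channel dimension.
    frames = min(level.frames for level in levels)
    channels = sum(level.channels for level in levels)
    return FeatureShape(channels, frames, levels[0].height, levels[0].width)


def tpn_shapes(backbone_feats: List[FeatureShape], rates: List[int],
               out_channels: int = 1024) -> FeatureShape:
    if len(backbone_feats) != len(rates):
        raise ValueError("need exactly one temporal rate per pyramid level")
    top = backbone_feats[-1]  # deepest level defines the spatial size
    aligned = [spatial_modulation(f, (top.height, top.width), out_channels)
               for f in backbone_feats]
    modulated = [temporal_rate_modulation(f, r)
                 for f, r in zip(aligned, rates)]
    # Information Flow would aggregate features across levels here; it is
    # assumed to leave each level's shape unchanged, so it is omitted.
    return final_prediction(modulated)


# Two hypothetical backbone levels: (channels, frames, height, width).
levels = [FeatureShape(1024, 8, 14, 14), FeatureShape(2048, 8, 7, 7)]
fused = tpn_shapes(levels, rates=[1, 2])
print(fused)
```

With these assumed inputs, both levels are aligned to a 7×7 spatial size and 1024 channels, the second level is temporally halved, and the fused feature has 2048 channels.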
Source: Temporal Pyramid Network for Action Recognition

Task | Papers | Share |
---|---|---|
Object Detection | 2 | 20.00% |
Domain Adaptation | 1 | 10.00% |
Unsupervised Domain Adaptation | 1 | 10.00% |
Action Classification | 1 | 10.00% |
Action Detection | 1 | 10.00% |
Few-Shot Learning | 1 | 10.00% |
Few-Shot Video Object Detection | 1 | 10.00% |
Video Object Detection | 1 | 10.00% |
Action Recognition | 1 | 10.00% |