1 code implementation • 1 Mar 2023 • Chenyang Liu, Jiajun Yang, Zipeng Qi, Zhengxia Zou, Zhenwei Shi
To sufficiently utilize the extracted multi-scale features for captioning, we propose a scale-aware reinforcement (SR) module and combine it with the Transformer decoding layer to progressively utilize the features from different PDP layers.