1 code implementation • 22 Oct 2022 • Jiaming Chen, Weixin Luo, Ran Song, Xiaolin Wei, Lin Ma, Wei Zhang
This paper presents a novel hierarchical alignment model (HAM) that learns multi-granularity visual and linguistic representations in an end-to-end manner.