no code implementations • 30 May 2023 • Lior Bracha, Eitan Shaar, Aviv Shamsian, Ethan Fetaya, Gal Chechik
Our results highlight the potential of using pre-trained visual-semantic models for generating high-quality contextual descriptions.
Referring Expression Referring expression generation