no code implementations • 28 Dec 2023 • Wenyi Wu, Qi Li, Wenliang Zhong, Junzhou Huang
Vision-language models have been explored across a wide range of tasks and achieve satisfactory performance.
no code implementations • 30 Aug 2023 • Wenyi Wu, Karim Bouyarmane, Ismail Tutar
We present Catalog Phrase Grounding (CPG), a model that associates product textual data (title, brands) with the corresponding regions of product images (isolated product region, brand logo region) for e-commerce vision-language applications.