Search Results for author: Wenyi Wu

Found 2 papers, 0 papers with code

MIVC: Multiple Instance Visual Component for Visual-Language Models

no code implementations | 28 Dec 2023 | Wenyi Wu, Qi Li, Wenliang Zhong, Junzhou Huang

Vision-language models have been widely explored across a wide range of tasks and achieve satisfactory performance.

Question Answering, Visual Question Answering

Catalog Phrase Grounding (CPG): Grounding of Product Textual Attributes in Product Images for e-commerce Vision-Language Applications

no code implementations | 30 Aug 2023 | Wenyi Wu, Karim Bouyarmane, Ismail Tutar

We present Catalog Phrase Grounding (CPG), a model that can associate product textual data (title, brands) into corresponding regions of product images (isolated product region, brand logo region) for e-commerce vision-language applications.

Decoder, object-detection, +2
