Search Results for author: Lluís Gómez

Found 3 papers, 2 papers with code

StacMR: Scene-Text Aware Cross-Modal Retrieval

1 code implementation • 8 Dec 2020 • Andrés Mafla, Rafael Sampaio de Rezende, Lluís Gómez, Diane Larlus, Dimosthenis Karatzas

Then, armed with this dataset, we describe several approaches which leverage scene text, including a better scene-text aware cross-modal retrieval method which uses specialized representations for text from the captions and text from the visual scene, and reconciles them in a common embedding space.

Cross-Modal Retrieval • Information Retrieval • +1

Multimodal grid features and cell pointers for Scene Text Visual Question Answering

no code implementations • 1 Jun 2020 • Lluís Gómez, Ali Furkan Biten, Rubèn Tito, Andrés Mafla, Marçal Rusiñol, Ernest Valveny, Dimosthenis Karatzas

This paper presents a new model for the task of scene text visual question answering, in which questions about a given image can only be answered by reading and understanding scene text that is present in it.

Question Answering • Visual Question Answering

Single Shot Scene Text Retrieval

3 code implementations • ECCV 2018 • Lluís Gómez, Andrés Mafla, Marçal Rusiñol, Dimosthenis Karatzas

In this way, the text-based image retrieval task can be cast as a simple nearest neighbor search of the query text representation over the outputs of the CNN computed over the entire image database.

Image Retrieval • Retrieval • +2
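
The nearest neighbor formulation in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each database image has already been reduced to a small set of fixed-length text descriptor vectors (standing in for the CNN outputs), that the query word is encoded in the same vector space, and that cosine similarity is the distance measure. The names `cosine`, `rank_images`, and the toy vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_images(query_vec, database):
    """Rank images by their best-matching descriptor.

    database: dict mapping image id -> list of descriptor vectors
    (assumed to be precomputed, e.g. one per predicted text region).
    Returns a list of (image_id, score) pairs, best match first.
    """
    scores = []
    for image_id, descriptors in database.items():
        # An image scores as well as its single closest descriptor.
        best = max((cosine(query_vec, d) for d in descriptors), default=-1.0)
        scores.append((image_id, best))
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores


# Toy 3-dimensional descriptors; real text representations would be much larger.
database = {
    "img_a": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "img_b": [[0.0, 0.0, 1.0]],
    "img_c": [[0.7, 0.7, 0.0]],
}
query = [1.0, 0.0, 0.0]

ranking = rank_images(query, database)
print([image_id for image_id, _ in ranking])
```

Scoring an image by its single closest descriptor is one simple choice for this sketch; an exhaustive scan like this is linear in the number of stored descriptors, so a large database would normally use an approximate nearest neighbor index instead.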
