Generalized Referring Expression Comprehension

5 papers with code • 1 benchmark • 1 dataset

Generalized Referring Expression Comprehension (GREC) allows expressions that indicate any number of target objects, including none. GREC takes an image and a referring expression as input and requires predicting the bounding box(es) of the target object(s).
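The input/output contract described above can be sketched in a few lines of Python. This is an illustrative sketch only: the `GrecSample` class, the `(x1, y1, x2, y2)` box layout, the greedy matching, and the 0.5 IoU threshold are all assumptions made for the example, not the benchmark's official data format or metric.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed box layout: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class GrecSample:
    """Hypothetical container for one GREC example."""
    image_path: str
    expression: str
    # May hold zero, one, or many boxes, since an expression
    # can refer to any number of targets.
    target_boxes: List[Box] = field(default_factory=list)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def sample_is_correct(pred: List[Box], gt: List[Box], thr: float = 0.5) -> bool:
    """Assumed rule: same number of boxes, and each prediction greedily
    matches a distinct ground-truth box with IoU >= thr."""
    if len(pred) != len(gt):
        return False
    unmatched = list(gt)
    for p in pred:
        best = max(unmatched, key=lambda g: iou(p, g), default=None)
        if best is None or iou(p, best) < thr:
            return False
        unmatched.remove(best)
    return True
```

Under this rule, an expression with no target is handled correctly only when the model returns an empty list, for example `sample_is_correct([], [])` is true while `sample_is_correct([(0, 0, 10, 10)], [])` is false. Greedy matching is used here for brevity; an optimal assignment could differ when boxes overlap heavily.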

Benchmarks

These leaderboards are used to track progress in Generalized Referring Expression Comprehension.

Dataset	Best Model
gRefCOCO	UNINEXT

Datasets

gRefCOCO

Most implemented papers

MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding

ashkamath/mdetr • 26 Apr 2021

We also investigate the utility of our model as an object detector on a given label set when fine-tuned in a few-shot setting.

Multi-task Collaborative Network for Joint Referring Expression Comprehension and Segmentation

luogen1996/MCN • CVPR 2020

In addition, we address a key challenge in this multi-task setup, i.e., the prediction conflict, with two innovative designs, namely Consistency Energy Maximization (CEM) and Adaptive Soft Non-Located Suppression (ASNLS).

Vision-Language Transformer and Query Generation for Referring Segmentation

henghuiding/Vision-Language-Transformer • ICCV 2021

We introduce transformers and multi-head attention to build an encoder-decoder attention network that "queries" the given image with the language expression.

Universal Instance Perception as Object Discovery and Retrieval

MasterBin-IIAU/UNINEXT • CVPR 2023

All instance perception tasks aim at finding certain objects specified by some queries such as category names, language expressions, and target annotations, but this complete field has been split into multiple independent subtasks.

GREC: Generalized Referring Expression Comprehension

henghuiding/grefcoco • 30 Aug 2023

This dataset encompasses a range of expressions: those referring to multiple targets, expressions with no specific target, and the single-target expressions.
