Paper tables with annotated results for Challenges and Prospects in Vision and Language Research

Paper

Challenges and Prospects in Vision and Language Research

Language-grounded image understanding tasks have often been proposed as a method for evaluating progress in artificial intelligence. Ideally, these tasks should test a plethora of capabilities that integrate computer vision, reasoning, and natural language understanding. However, rather than behaving as visual Turing tests, recent studies have demonstrated that state-of-the-art systems are achieving good performance by exploiting flaws in datasets and evaluation procedures. We review the current state of affairs and outline a path forward.




