VizWiz (VizWiz-VQA)

Introduced by Gurari et al. in VizWiz Grand Challenge: Answering Visual Questions from Blind People

The VizWiz-VQA dataset originates from a natural visual question answering setting in which blind people each took an image and recorded a spoken question about it. Each visual question comes with 10 crowdsourced answers. The proposed challenge addresses two tasks for this dataset: (1) predict the answer to a visual question and (2) predict whether a visual question cannot be answered.

Source: https://vizwiz.org/tasks-and-datasets/vqa/
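To make the two tasks described above concrete, the following is a minimal Python sketch of how the 10 crowdsourced answers for one visual question might be used. It is an illustration under assumptions, not the official evaluation code: the record layout, the function names, the "unanswerable" marker string, and the majority-vote rule are all hypothetical, and the scoring rule is the simplified VQA-style consensus accuracy min(1, matches / 3), which may differ in detail from the challenge's own script.

```python
def consensus_accuracy(predicted: str, human_answers: list[str]) -> float:
    """Simplified VQA-style accuracy: min(1, number of matching human answers / 3).

    Assumption: answers are compared after lowercasing and stripping whitespace.
    """
    target = predicted.strip().lower()
    matches = sum(1 for a in human_answers if a.strip().lower() == target)
    return min(1.0, matches / 3.0)


def majority_says_unanswerable(human_answers: list[str],
                               marker: str = "unanswerable") -> bool:
    """Hypothetical rule: treat the question as unanswerable when more than
    half of the crowd answers equal the marker string."""
    flagged = sum(1 for a in human_answers if a.strip().lower() == marker)
    return flagged > len(human_answers) / 2


# Hypothetical record: one visual question with its 10 crowdsourced answers.
record = {
    "question": "What is this?",
    "answers": [
        "cola", "coca cola", "cola", "soda", "cola",
        "unanswerable", "cola", "coke", "soda", "coca cola",
    ],
}

# Task 1: score a predicted answer against the crowd answers.
score_cola = consensus_accuracy("cola", record["answers"])
score_coke = consensus_accuracy("coke", record["answers"])

# Task 2: derive a binary answerability target from the crowd answers.
unanswerable = majority_says_unanswerable(record["answers"])

print(score_cola, score_coke, unanswerable)
```

Under this sketch, a prediction earns full credit once at least three annotators gave the same answer, and partial credit otherwise, which is one way to tolerate disagreement among the 10 answers.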

Benchmarks

Task	Dataset Variant	Best Model
Image Captioning	VizWiz 2020 test-dev	IBM Research AI
Visual Question Answering (VQA)	VizWiz 2020 VQA	PaLI
Image Captioning	VizWiz 2020 test	IBM Research AI
Visual Question Answering (VQA)	VizWiz 2018	Colin
Visual Question Answering (VQA)	VizWiz 2020 Answerability	CLIP-Ensemble
Visual Question Answering (VQA)	VizWiz 2018 Answerability	ensemble_two_best
Visual Question Answering	VizWiz	Emu-I *