no code implementations • 1 Nov 2023 • Md Farhan Ishmam, Md Sakib Hossain Shovon, M. F. Mridha, Nilanjan Dey
The multimodal task of Visual Question Answering (VQA) encompassing elements of Computer Vision (CV) and Natural Language Processing (NLP), aims to generate answers to questions on any visual input.