1 code implementation • 7 Feb 2024 • Gilles Baechler, Srinivas Sunkara, Maria Wang, Fedir Zubach, Hassan Mansoor, Vincent Etter, Victor Cărbune, Jason Lin, Jindong Chen, Abhanshu Sharma
At the heart of the training data mixture is a novel screen annotation task in which the model has to identify the type and location of UI elements.
Ranked #3 on Visual Question Answering (VQA) on InfographicVQA (using extra training data)
1 code implementation • 16 Sep 2022 • Yu-Chung Hsiao, Fedir Zubach, Maria Wang, Jindong Chen
We present a new task and dataset, ScreenQA, for screen content understanding via question answering.