no code implementations • CVPR 2023 • Qian Jiang, Changyou Chen, Han Zhao, Liqun Chen, Qing Ping, Son Dinh Tran, Yi Xu, Belinda Zeng, Trishul Chilimbi
Hence we advocate that the key to better performance lies in meaningful latent modality structures instead of perfect modality alignment.
Few-Shot Image Classification • Open-Ended Question Answering • +6