2 code implementations • 11 Apr 2024 • Jingxuan Xu, Wuyang Chen, Yao Zhao, Yunchao Wei
In the context of efficient open-vocabulary segmentation (OVS), we aim to match or exceed the performance of prior OVS works based on large vision-language foundation models, while using smaller models that incur lower training costs.
1 code implementation • 31 Mar 2024 • Haolin Qin, Tingfa Xu, Peifu Liu, Jingxuan Xu, Jianan Li
To address these challenges, we propose a novel approach termed the Distilled Mixed Spectral-Spatial Network (DMSSN), comprising a Distilled Spectral Encoding process and a Mixed Spectral-Spatial Transformer (MSST) feature extraction network.