1 code implementation • 20 Jun 2023 • Zilun Zhang, Tiancheng Zhao, Yulong Guo, Jianwei Yin
Moreover, we present an image-text paired dataset in the field of remote sensing (RS), RS5M, which has 5 million RS images with English descriptions.
Ranked #1 on Cross-Modal Retrieval on RSITMD (using extra training data)
no code implementations • 3 Oct 2022 • Zilun Zhang, Farzad Khalvati
Many high-performance classification models utilize complex CNN-based architectures for Alzheimer's Disease classification.
no code implementations • 31 Aug 2022 • Zilun Zhang, Cuifeng Shen, Yuan Shen, Huixin Xiong, Xinyu Zhou
Although CLIP-like Visual Language Models provide a functional joint feature space for image and text, due to the limitation of the CLIP-like model's image input size (e.g., 224), subtle details are lost in the feature representation if we input high-resolution images (e.g., 2240).
no code implementations • 25 Jul 2021 • Zilun Zhang, Shihao Ma, Yichun Zhang
Most few-shot learning models utilize only one modality of data.
1 code implementation • CVPR 2020 • Ling Yang, Liangliang Li, Zilun Zhang, Xinyu Zhou, Erjin Zhou, Yu Liu
To combine the distribution-level relations and instance-level relations for all examples, we construct a dual complete graph network which consists of a point graph and a distribution graph with each node standing for an example.
Ranked #2 on Few-Shot Learning on Mini-ImageNet - 1-Shot Learning