1 code implementation • 18 Jan 2024 • Yang Zhan, Zhitong Xiong, Yuan Yuan
Specifically, RS visual features are projected into the language domain via an alignment layer and then fed, together with task-specific instructions, into an LLM-based RS decoder to predict answers for open-ended RS tasks.
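A minimal sketch of this kind of pipeline, assuming a simple linear alignment layer and concatenation of visual tokens with instruction embeddings before the LLM decoder (all dimensions, shapes, and names here are illustrative assumptions, not the paper's actual configuration):

```python
import torch
import torch.nn as nn

class AlignmentLayer(nn.Module):
    """Hypothetical sketch: projects RS visual features into the
    LLM token-embedding space via a single linear map."""
    def __init__(self, vis_dim: int, llm_dim: int):
        super().__init__()
        self.proj = nn.Linear(vis_dim, llm_dim)

    def forward(self, vis_feats: torch.Tensor) -> torch.Tensor:
        # vis_feats: (batch, num_patches, vis_dim) -> (batch, num_patches, llm_dim)
        return self.proj(vis_feats)

# Aligned visual tokens are concatenated with embedded instruction tokens
# to form the input sequence for an LLM-based decoder.
vis = torch.randn(2, 196, 1024)      # RS image patch features (assumed shape)
instr = torch.randn(2, 32, 4096)     # embedded task-specific instructions (assumed shape)
aligned = AlignmentLayer(1024, 4096)(vis)
llm_input = torch.cat([aligned, instr], dim=1)   # (2, 196 + 32, 4096)
```

The decoder itself (omitted) would consume `llm_input` autoregressively to generate answers for open-ended RS tasks.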
1 code implementation • 13 Dec 2023 • Yang Zhan, Yuan Yuan, Zhitong Xiong
To foster this task, we propose Mono3DVG-TR, an end-to-end transformer-based network, which takes advantage of both the appearance and geometry information in text embeddings for multi-modal learning and 3D object localization.
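One plausible way to fuse text cues carrying appearance and geometry information with visual tokens, as in transformer-based grounding networks, is cross-attention from image tokens to text embeddings. The sketch below is a generic illustration under assumed shapes and a standard 7-parameter 3D box head, not the Mono3DVG-TR architecture itself:

```python
import torch
import torch.nn as nn

# Image queries attend to the referring expression's text embeddings,
# injecting language cues into the visual representation.
attn = nn.MultiheadAttention(embed_dim=256, num_heads=8, batch_first=True)
img_tokens = torch.randn(2, 100, 256)  # visual tokens from a transformer encoder (assumed)
txt_tokens = torch.randn(2, 20, 256)   # text embeddings of the query sentence (assumed)
fused, _ = attn(query=img_tokens, key=txt_tokens, value=txt_tokens)

# A small head could then regress a 3D box from the fused features;
# (x, y, z, w, h, l, yaw) is one common parameterization, assumed here.
head = nn.Linear(256, 7)
boxes = head(fused.mean(dim=1))        # one box per image in the batch
```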
1 code implementation • 24 Aug 2023 • Yuan Yuan, Yang Zhan, Zhitong Xiong
To address this issue, we investigate parameter-efficient transfer learning (PETL) to transfer visual-language knowledge from the natural-image domain to the RS domain effectively and efficiently on the image-text retrieval task.
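The core idea of PETL methods such as bottleneck adapters is to freeze the pretrained backbone and train only a small number of inserted parameters. A minimal sketch, assuming a residual bottleneck adapter (the dimensions and the stand-in backbone block are illustrative, not the paper's design):

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Hypothetical bottleneck adapter: down-project, nonlinearity,
    up-project, with a residual connection."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))  # residual update

backbone = nn.Linear(512, 512)      # stand-in for one frozen pretrained block
for p in backbone.parameters():
    p.requires_grad = False          # pretrained weights stay fixed
adapter = Adapter(512)               # only adapter parameters are trained

x = torch.randn(4, 512)
out = adapter(backbone(x))
trainable = sum(p.numel() for p in adapter.parameters())
frozen = sum(p.numel() for p in backbone.parameters())
```

Here `trainable` is a small fraction of `frozen`, which is what makes the transfer parameter-efficient.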
Ranked #3 on Cross-Modal Retrieval on RSICD
1 code implementation • 23 Oct 2022 • Yang Zhan, Zhitong Xiong, Yuan Yuan
However, object-level visual grounding on RS images remains underexplored.