2 code implementations • 27 Mar 2024 • Yanwei Li, Yuechen Zhang, Chengyao Wang, Zhisheng Zhong, Yixin Chen, Ruihang Chu, Shaoteng Liu, Jiaya Jia
We try to narrow the gap by mining the potential of VLMs for better performance and any-to-any workflow from three aspects, i.e., high-resolution visual tokens, high-quality data, and VLM-guided generation.
Ranked #9 on Visual Question Answering on MM-Vet
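The abstract above names "high-resolution visual tokens" as one ingredient. A minimal sketch of one way to realize that idea is below: low-resolution visual tokens act as queries that gather detail from a larger set of high-resolution tokens via cross-attention. All names and the residual-refinement form are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: coarse tokens attend over fine tokens.
# Function names and the residual form are assumptions, not the paper's code.
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mine_high_res(low_tokens, high_tokens):
    """Each low-res token gathers detail from all high-res tokens."""
    d = low_tokens.shape[-1]
    attn = softmax(low_tokens @ high_tokens.T / np.sqrt(d))  # (n_low, n_high)
    return low_tokens + attn @ high_tokens                    # residual refinement

low = np.random.default_rng(1).normal(size=(64, 32))     # 64 coarse tokens, dim 32
high = np.random.default_rng(2).normal(size=(1024, 32))  # 1024 fine tokens
out = mine_high_res(low, high)
print(out.shape)  # → (64, 32)
```

The output keeps the compact low-resolution token count, so the language model's sequence length does not grow with image resolution.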
1 code implementation • 14 Mar 2024 • Chengyao Wang, Li Jiang, Xiaoyang Wu, Zhuotao Tian, Bohao Peng, Hengshuang Zhao, Jiaya Jia
To address this issue, we propose GroupContrast, a novel approach that combines segment grouping and semantic-aware contrastive learning.
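The entry above describes combining segment grouping with semantic-aware contrastive learning. As a rough, hedged sketch of that combination (not the authors' GroupContrast loss), an InfoNCE-style objective can treat points that share a segment id as positives and all other points as negatives:

```python
# Hedged sketch: segment-grouped InfoNCE-style contrastive loss.
# The grouping-as-positives idea follows the abstract; the exact loss
# form here is an assumption, not the paper's implementation.
import numpy as np

def segment_contrastive_loss(feats, seg_ids, tau=0.07):
    """Points sharing a segment id are positives; others are negatives."""
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)  # cosine features
    sim = f @ f.T / tau                                       # similarity logits
    n = len(seg_ids)
    self_mask = np.eye(n, dtype=bool)
    pos = (seg_ids[:, None] == seg_ids[None, :]) & ~self_mask
    # log-softmax over all other points (exclude self from the denominator)
    logits = sim - sim.max(axis=1, keepdims=True)
    exp = np.exp(logits)
    exp[self_mask] = 0.0
    log_prob = logits - np.log(exp.sum(axis=1, keepdims=True))
    # negative mean log-probability of positives per anchor
    counts = pos.sum(axis=1)
    per_anchor = -(log_prob * pos).sum(axis=1)[counts > 0] / counts[counts > 0]
    return per_anchor.mean()

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))
seg_ids = np.array([0, 0, 1, 1, 2, 2, 3, 3])
loss = float(segment_contrastive_loss(feats, seg_ids))
print(loss)
```

With random features the loss sits near the uniform baseline; pulling same-segment features together drives it down.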
2 code implementations • 28 Nov 2023 • Yanwei Li, Chengyao Wang, Jiaya Jia
Current VLMs, while proficient in tasks like image captioning and visual question answering, face computational burdens when processing long videos due to the excessive number of visual tokens.
Ranked #6 on Zero-Shot Video Question Answer on ActivityNet-QA
Image Captioning • Video-based Generative Performance Benchmarking • +2
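The computational burden described above comes from the visual token count growing linearly with video length. A minimal sketch of the token-budget idea, reducing each frame to a small fixed number of tokens by average pooling, is below; this is an illustration of the general compression strategy, not the paper's exact per-frame token scheme.

```python
# Hedged sketch: cap per-frame visual tokens by average pooling.
# Illustrates the token-budget idea only; not the paper's method.
import numpy as np

def compress_frame_tokens(frame_tokens, tokens_per_frame=2):
    """Pool each frame's patch tokens into a fixed small number of tokens."""
    t, n, d = frame_tokens.shape                 # (frames, patches, dim)
    groups = np.array_split(np.arange(n), tokens_per_frame)
    pooled = np.stack([frame_tokens[:, g, :].mean(axis=1) for g in groups], axis=1)
    return pooled.reshape(t * tokens_per_frame, d)  # flat token sequence

video = np.zeros((120, 256, 64))  # 120 frames, 256 patch tokens of dim 64 each
out = compress_frame_tokens(video)
print(out.shape)  # → (240, 64)
```

Here 120 × 256 = 30,720 tokens shrink to 240, so sequence length scales with frame count times a small constant rather than with full patch grids.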
no code implementations • 27 Jun 2023 • Bohao Peng, Zhuotao Tian, Xiaoyang Wu, Chengyao Wang, Shu Liu, Jingyong Su, Jiaya Jia
We hope our work can benefit broader industrial applications where novel classes with limited annotations need to be reliably identified.