Search Results for author: Lumin Xu

Found 9 papers, 6 papers with code

UniFS: Universal Few-shot Instance Perception with Point Representations

no code implementations • 30 Apr 2024 • Sheng Jin, Ruijie Yao, Lumin Xu, Wentao Liu, Chen Qian, Ji Wu, Ping Luo

In this paper, we propose UniFS, a universal few-shot instance perception model that unifies a wide range of instance perception tasks by reformulating them into a dynamic point representation learning framework.

Few-Shot Learning Instance Segmentation +5

CLIM: Contrastive Language-Image Mosaic for Region Representation

1 code implementation • 18 Dec 2023 • Size Wu, Wenwei Zhang, Lumin Xu, Sheng Jin, Wentao Liu, Chen Change Loy

Our experimental results demonstrate that CLIM improves different baseline open-vocabulary object detectors by a large margin on both OV-COCO and OV-LVIS benchmarks.

Ranked #6 on Open Vocabulary Object Detection on LVIS v1.0

Object object-detection +1

Open-Vocabulary Animal Keypoint Detection with Semantic-feature Matching

no code implementations • 8 Oct 2023 • Hao Zhang, Lumin Xu, Shenqi Lai, Wenqi Shao, Nanning Zheng, Ping Luo, Yu Qiao, Kaipeng Zhang

Current image-based keypoint detection methods for animal (including human) bodies and faces are generally divided into fully supervised and few-shot class-agnostic approaches.

Keypoint Detection

CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction

1 code implementation • 2 Oct 2023 • Size Wu, Wenwei Zhang, Lumin Xu, Sheng Jin, Xiangtai Li, Wentao Liu, Chen Change Loy

However, when transferring the vision-language alignment of CLIP from global image representation to local region representation for the open-vocabulary dense prediction tasks, CLIP ViTs suffer from the domain shift from full images to local image regions.

Ranked #3 on Open Vocabulary Semantic Segmentation on PASCAL Context-59

Image Classification Image Segmentation +7

GKGNet: Group K-Nearest Neighbor based Graph Convolutional Network for Multi-Label Image Recognition

no code implementations • 28 Aug 2023 • Ruijie Yao, Sheng Jin, Lumin Xu, Wang Zeng, Wentao Liu, Chen Qian, Ping Luo, Ji Wu

Multi-Label Image Recognition (MLIR) is a challenging task that aims to predict multiple object labels in a single image while modeling the complex relationships between labels and image regions.

graph construction

ZoomNAS: Searching for Whole-body Human Pose Estimation in the Wild

1 code implementation • 23 Aug 2022 • Lumin Xu, Sheng Jin, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang

We propose a single-network approach, termed ZoomNet, to take into account the hierarchical structure of the full human body and solve the scale variation of different body parts.

Ranked #2 on 2D Human Pose Estimation on COCO-WholeBody

2D Human Pose Estimation Neural Architecture Search +1

Pose for Everything: Towards Category-Agnostic Pose Estimation

1 code implementation • 21 Jul 2022 • Lumin Xu, Sheng Jin, Wang Zeng, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo, Xiaogang Wang

In this paper, we introduce the task of Category-Agnostic Pose Estimation (CAPE), which aims to create a pose estimation model capable of detecting the pose of any class of object given only a few samples with keypoint definition.

Ranked #4 on 2D Pose Estimation on MP-100

Category-Agnostic Pose Estimation Pose Estimation

ViPNAS: Efficient Video Pose Estimation via Neural Architecture Search

4 code implementations • CVPR 2021 • Lumin Xu, Yingda Guan, Sheng Jin, Wentao Liu, Chen Qian, Ping Luo, Wanli Ouyang, Xiaogang Wang

Human pose estimation has achieved significant progress in recent years.

Ranked #23 on Pose Estimation on COCO test-dev

Neural Architecture Search Pose Estimation

Whole-Body Human Pose Estimation in the Wild

2 code implementations • ECCV 2020 • Sheng Jin, Lumin Xu, Jin Xu, Can Wang, Wentao Liu, Chen Qian, Wanli Ouyang, Ping Luo

This paper investigates the task of 2D human whole-body pose estimation, which aims to localize dense landmarks on the entire human body including face, hands, body, and feet.

Ranked #8 on 2D Human Pose Estimation on COCO-WholeBody

2D Human Pose Estimation Facial Landmark Detection +2
