1 code implementation • 27 Feb 2024 • Chengcheng Wang, Zhiwei Hao, Yehui Tang, Jianyuan Guo, Yujie Yang, Kai Han, Yunhe Wang
In this paper, we propose the SAM-DiffSR model, which utilizes the fine-grained structure information from SAM to modulate the sampled noise, improving image quality without additional computational cost during inference.
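As a rough illustration of structure-aware noise, here is a minimal PyTorch sketch in which a per-segment offset derived from a SAM mask shifts the mean of the Gaussian noise used in the diffusion forward process; the function names and the exact modulation rule are assumptions, not the paper's implementation.

```python
# Hedged sketch: shift the mean of diffusion noise per SAM segment.
import torch

def modulate_noise(noise: torch.Tensor, seg_mask: torch.Tensor,
                   offsets: torch.Tensor) -> torch.Tensor:
    """noise: (B, C, H, W); seg_mask: (B, H, W) integer segment ids;
    offsets: (num_segments,) learned per-segment mean shifts."""
    shift = offsets[seg_mask]              # (B, H, W) per-pixel shift
    return noise + shift.unsqueeze(1)      # broadcast over channels

B, C, H, W = 2, 3, 8, 8
noise = torch.randn(B, C, H, W)
seg = torch.randint(0, 4, (B, H, W))       # stand-in for a SAM segmentation
offsets = torch.zeros(4, requires_grad=True)  # trainable structural offsets
print(modulate_noise(noise, seg, offsets).shape)  # torch.Size([2, 3, 8, 8])
```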
1 code implementation • 7 Feb 2024 • Jianyuan Guo, Zhiwei Hao, Chengcheng Wang, Yehui Tang, Han Wu, Han Hu, Kai Han, Chang Xu
Training general-purpose vision models on purely sequential visual data, eschewing linguistic inputs, has heralded a new frontier in visual understanding.
1 code implementation • 6 Feb 2024 • Jianyuan Guo, Hanting Chen, Chengcheng Wang, Kai Han, Chang Xu, Yunhe Wang
Recent advancements in large language models have sparked interest in their extraordinary and near-superhuman capabilities, leading researchers to explore methods for evaluating and optimizing these abilities, a problem referred to as superalignment.
no code implementations • 5 Feb 2024 • Yehui Tang, Yunhe Wang, Jianyuan Guo, Zhijun Tu, Kai Han, Hailin Hu, DaCheng Tao
Model compression methods reduce the memory and computational cost of Transformers, a necessary step for deploying large language/vision models on practical devices.
no code implementations • 27 Dec 2023 • Yunhe Wang, Hanting Chen, Yehui Tang, Tianyu Guo, Kai Han, Ying Nie, Xutao Wang, Hailin Hu, Zheyuan Bai, Yun Wang, Fangcheng Liu, Zhicheng Liu, Jianyuan Guo, Sinan Zeng, Yinchen Zhang, Qinghua Xu, Qun Liu, Jun Yao, Chao Xu, DaCheng Tao
We then demonstrate through carefully designed ablations that the proposed approach is significantly effective for enhancing model nonlinearity; thus, we present a new efficient model architecture for establishing modern LLMs, namely, PanGu-$\pi$.
1 code implementation • NeurIPS 2023 • Zhiwei Hao, Jianyuan Guo, Kai Han, Yehui Tang, Han Hu, Yunhe Wang, Chang Xu
To tackle the challenge in distilling heterogeneous models, we propose a simple yet effective one-for-all KD framework called OFA-KD, which significantly improves the distillation performance between heterogeneous architectures.
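To make the heterogeneous-distillation idea concrete, the sketch below projects student features into the teacher's logit space so the KD loss is architecture-agnostic; the projector design and loss form here are simplified assumptions rather than the OFA-KD implementation.

```python
# Minimal sketch of logits-space distillation between heterogeneous models.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LogitProjector(nn.Module):
    def __init__(self, feat_dim: int, num_classes: int):
        super().__init__()
        self.proj = nn.Linear(feat_dim, num_classes)

    def forward(self, student_feat):
        return self.proj(student_feat)     # map features to class logits

def kd_loss(student_logits, teacher_logits, T: float = 4.0):
    """Standard temperature-scaled KL divergence used in distillation."""
    p_t = F.softmax(teacher_logits / T, dim=-1)
    log_p_s = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean") * T * T

proj = LogitProjector(feat_dim=192, num_classes=1000)
s_feat = torch.randn(8, 192)               # e.g. a CNN student's features
t_logits = torch.randn(8, 1000)            # e.g. a ViT teacher's logits
print(kd_loss(proj(s_feat), t_logits).item())
```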
3 code implementations • NeurIPS 2023 • Chengcheng Wang, Wei He, Ying Nie, Jianyuan Guo, Chuanjian Liu, Kai Han, Yunhe Wang
In the past years, YOLO-series models have emerged as the leading approaches in the area of real-time object detection.
no code implementations • 26 Jun 2023 • Kai Han, Yunhe Wang, Jianyuan Guo, Enhua Wu
In the language domain, LLaMA-1B enhanced with ParameterNet achieves 2% higher accuracy than vanilla LLaMA.
1 code implementation • 25 May 2023 • Zhiwei Hao, Jianyuan Guo, Kai Han, Han Hu, Chang Xu, Yunhe Wang
The tremendous success of large models trained on extensive datasets demonstrates that scale is a key ingredient in achieving superior results.
4 code implementations • NeurIPS 2023 • Hanting Chen, Yunhe Wang, Jianyuan Guo, DaCheng Tao
In this study, we introduce VanillaNet, a neural network architecture that embraces elegance in design.
1 code implementation • CVPR 2023 • Haoqing Wang, Yehui Tang, Yunhe Wang, Jianyuan Guo, Zhi-Hong Deng, Kai Han
The lower layers are not explicitly guided and the interaction among their patches is only used for calculating new activations.
1 code implementation • 13 Dec 2022 • Jianyuan Guo, Kai Han, Han Wu, Yehui Tang, Yunhe Wang, Chang Xu
This paper presents FastMIM, a simple and generic framework for expediting masked image modeling with the following two steps: (i) pre-training vision backbones with low-resolution input images; and (ii) reconstructing Histograms of Oriented Gradients (HOG) feature instead of original RGB values of the input images.
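The two FastMIM ingredients can be sketched directly: (i) downsample the input and (ii) regress HOG features instead of raw pixels. The masking and patch bookkeeping below are deliberately simplified assumptions, using `skimage.feature.hog` for the target.

```python
# Hedged sketch of FastMIM's two steps: low-res input + HOG regression target.
import numpy as np
import torch
import torch.nn.functional as F
from skimage.feature import hog

def hog_target(image: np.ndarray) -> torch.Tensor:
    """image: (H, W) grayscale in [0, 1]; returns a flat HOG feature vector."""
    feats = hog(image, orientations=9, pixels_per_cell=(8, 8),
                cells_per_block=(1, 1), feature_vector=True)
    return torch.from_numpy(feats).float()

img = np.random.rand(224, 224)
low_res = img[::2, ::2]                    # (i) 112x112 low-resolution input
target = hog_target(low_res)               # (ii) HOG reconstruction target
pred = torch.zeros_like(target)            # stand-in for decoder output
loss = F.mse_loss(pred, target)            # regress HOG instead of RGB
print(target.shape, loss.item())
```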
15 code implementations • 23 Nov 2022 • Yehui Tang, Kai Han, Jianyuan Guo, Chang Xu, Chao Xu, Yunhe Wang
The convolutional operation can only capture local information in a window region, which prevents performance from being further improved.
no code implementations • 2 Sep 2022 • Han Wu, Jie Yin, Bala Rajaratnam, Jianyuan Guo
By jointly capturing three levels of relational information (entity-level, triplet-level, and context-level), HiRe can effectively learn and refine the meta representation of few-shot relations, and consequently generalizes well to unseen relations.
11 code implementations • 1 Jun 2022 • Kai Han, Yunhe Wang, Jianyuan Guo, Yehui Tang, Enhua Wu
In this paper, we propose to represent the image as a graph structure and introduce a new Vision GNN (ViG) architecture to extract graph-level features for visual tasks.
Ranked #365 on Image Classification on ImageNet
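A minimal sketch of the ViG idea follows: treat patch embeddings as graph nodes, connect each node to its k nearest neighbours in feature space, and update nodes with a max-relative graph convolution. The aggregation details are assumptions based on the paper's high-level description.

```python
# Hedged sketch: k-NN graph over patch features + max-relative graph conv.
import torch
import torch.nn as nn

def knn_graph(x: torch.Tensor, k: int) -> torch.Tensor:
    """x: (N, D) node features -> (N, k) indices of nearest neighbours."""
    dist = torch.cdist(x, x)                   # (N, N) pairwise distances
    return dist.topk(k + 1, largest=False).indices[:, 1:]  # drop self

class MaxRelativeGraphConv(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.fc = nn.Linear(2 * dim, dim)

    def forward(self, x: torch.Tensor, idx: torch.Tensor) -> torch.Tensor:
        neighbours = x[idx]                    # (N, k, D)
        rel = neighbours - x.unsqueeze(1)      # relative features per edge
        agg = rel.max(dim=1).values            # max aggregation over edges
        return self.fc(torch.cat([x, agg], dim=-1))

nodes = torch.randn(196, 192)                  # 14x14 patches as graph nodes
idx = knn_graph(nodes, k=9)
print(MaxRelativeGraphConv(192)(nodes, idx).shape)  # torch.Size([196, 192])
```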
4 code implementations • CVPR 2022 • Wenshuo Li, Hanting Chen, Jianyuan Guo, Ziyang Zhang, Yunhe Wang
However, due to the simplicity of their structures, performance highly depends on the local feature communication mechanism.
8 code implementations • 10 Jan 2022 • Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chunjing Xu, Enhua Wu, Qi Tian
The proposed C-Ghost module can be taken as a plug-and-play component to upgrade existing convolutional neural networks.
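A compact sketch of a Ghost-style module is shown below: a primary convolution produces a few intrinsic feature maps, and cheap depthwise operations generate the remaining "ghost" features. The channel split ratio here is illustrative.

```python
# Sketch of a Ghost-style module: primary conv + cheap depthwise features.
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, ratio: int = 2):
        super().__init__()
        init_ch = out_ch // ratio
        self.primary = nn.Conv2d(in_ch, init_ch, 1, bias=False)
        self.cheap = nn.Conv2d(init_ch, out_ch - init_ch, 3, padding=1,
                               groups=init_ch, bias=False)  # depthwise

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

m = GhostModule(16, 32)
print(m(torch.randn(1, 16, 28, 28)).shape)  # torch.Size([1, 32, 28, 28])
```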
1 code implementation • 4 Jan 2022 • Kai Han, Jianyuan Guo, Yehui Tang, Yunhe Wang
We hope this new baseline will be helpful for further research on and application of vision transformers.
10 code implementations • CVPR 2022 • Yehui Tang, Kai Han, Jianyuan Guo, Chang Xu, Yanxi Li, Chao Xu, Yunhe Wang
To dynamically aggregate tokens, we propose to represent each token as a wave function with two parts, amplitude and phase.
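The wave representation can be sketched as follows: each token is expanded into real and imaginary parts, $a_j \odot \cos\theta_j$ and $a_j \odot \sin\theta_j$, before tokens are mixed channel-wise. The mixing layout below is a simplified assumption, not the Wave-MLP block.

```python
# Hedged sketch: tokens as waves (amplitude + estimated phase), then mixing.
import torch
import torch.nn as nn

class WaveTokenMix(nn.Module):
    def __init__(self, num_tokens: int, dim: int):
        super().__init__()
        self.theta = nn.Linear(dim, dim)       # estimate per-token phase
        self.mix_r = nn.Linear(num_tokens, num_tokens, bias=False)
        self.mix_i = nn.Linear(num_tokens, num_tokens, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """x: (B, N, D) token amplitudes."""
        theta = self.theta(x)                  # (B, N, D) phases
        real = x * torch.cos(theta)            # wave -> real part
        imag = x * torch.sin(theta)            # wave -> imaginary part
        # mix information across tokens for each channel independently
        real = self.mix_r(real.transpose(1, 2)).transpose(1, 2)
        imag = self.mix_i(imag.transpose(1, 2)).transpose(1, 2)
        return real + imag

out = WaveTokenMix(num_tokens=196, dim=192)(torch.randn(2, 196, 192))
print(out.shape)  # torch.Size([2, 196, 192])
```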
10 code implementations • CVPR 2022 • Jianyuan Guo, Yehui Tang, Kai Han, Xinghao Chen, Han Wu, Chao Xu, Chang Xu, Yunhe Wang
Previous vision MLPs such as MLP-Mixer and ResMLP accept linearly flattened image patches as input, making them inflexible for different input sizes and hard to capture spatial information.
14 code implementations • CVPR 2022 • Jianyuan Guo, Kai Han, Han Wu, Yehui Tang, Xinghao Chen, Yunhe Wang, Chang Xu
Vision transformers have been successfully applied to image recognition tasks due to their ability to capture long-range dependencies within an image.
1 code implementation • 3 Jul 2021 • Zhiwei Hao, Jianyuan Guo, Ding Jia, Kai Han, Yehui Tang, Chao Zhang, Han Hu, Yunhe Wang
Specifically, we train a tiny student model to match a pre-trained teacher model in the patch-level manifold space.
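One way to read "matching in the patch-level manifold space" is as matching patch-to-patch relation matrices, which also sidesteps the teacher/student dimension mismatch; the relation and loss forms below are assumptions in that spirit.

```python
# Hedged sketch: match patch-level relation (manifold) matrices.
import torch
import torch.nn.functional as F

def manifold(feats: torch.Tensor) -> torch.Tensor:
    """feats: (B, N, D) patch features -> (B, N, N) relation matrix."""
    z = F.normalize(feats, dim=-1)
    return z @ z.transpose(1, 2)               # cosine similarities

def manifold_kd_loss(student: torch.Tensor, teacher: torch.Tensor):
    return F.mse_loss(manifold(student), manifold(teacher))

s = torch.randn(4, 196, 192)                    # tiny student's patch features
t = torch.randn(4, 196, 768)                    # larger teacher's features
print(manifold_kd_loss(s, t).item())            # relations match despite dims
```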
no code implementations • CVPR 2021 • Jianyuan Guo, Kai Han, Han Wu, Chao Zhang, Xinghao Chen, Chunjing Xu, Chang Xu, Yunhe Wang
In this paper, we present a positive-unlabeled learning based scheme to expand training data by purifying valuable images from massive unlabeled ones, where the original training data are viewed as positive data and images in the wild as unlabeled data.
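For context, one standard way to train a classifier on positive plus unlabeled data is the non-negative PU risk estimator of Kiryo et al. (2017); the sketch below shows that estimator, not necessarily the loss used in this paper, and the class prior `pi_p` is an assumed hyperparameter.

```python
# Hedged sketch of a non-negative PU (nnPU) risk estimator.
import torch
import torch.nn.functional as F

def nnpu_loss(scores_p, scores_u, pi_p: float = 0.3):
    """scores_p/scores_u: raw classifier scores for positive/unlabeled data."""
    loss_p_pos = F.softplus(-scores_p).mean()   # positives labelled positive
    loss_p_neg = F.softplus(scores_p).mean()    # positives labelled negative
    loss_u_neg = F.softplus(scores_u).mean()    # unlabeled labelled negative
    neg_risk = loss_u_neg - pi_p * loss_p_neg
    return pi_p * loss_p_pos + torch.clamp(neg_risk, min=0.0)

p = torch.randn(128)                            # scores on curated positives
u = torch.randn(1024)                           # scores on wild unlabeled data
print(nnpu_loss(p, u).item())
```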
no code implementations • CVPR 2022 • Yehui Tang, Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chao Xu, DaCheng Tao
We first identify the effective patches in the last layer and then use them to guide the patch selection process of previous layers.
Ranked #8 on Efficient ViTs on ImageNet-1K (with DeiT-T)
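The top-down selection can be sketched as follows: starting from the patches kept in the last layer, each earlier layer keeps the patches that the retained ones attend to most. The attention shapes and scoring rule below are assumptions.

```python
# Hedged sketch: propagate a keep-set of patches from the last layer backward.
import torch

def select_patches(attn_per_layer, keep_last, k):
    """attn_per_layer: list of (N, N) attention maps, first to last layer;
    keep_last: (m,) indices kept in the final layer; k: patches per layer."""
    keep = keep_last
    kept_indices = [keep]
    for attn in reversed(attn_per_layer):
        scores = attn[keep].sum(dim=0)          # mass the kept patches place
        keep = scores.topk(k).indices           # on each earlier patch
        kept_indices.append(keep)
    return list(reversed(kept_indices))         # first layer ... last layer

layers = [torch.rand(196, 196).softmax(dim=-1) for _ in range(4)]
kept = select_patches(layers, keep_last=torch.arange(32), k=32)
print([ix.shape for ix in kept])
```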
1 code implementation • CVPR 2021 • Jianyuan Guo, Kai Han, Yunhe Wang, Han Wu, Xinghao Chen, Chunjing Xu, Chang Xu
To this end, we present a novel distillation algorithm via decoupled features (DeFeat) for learning a better student detector.
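Decoupling can be sketched as distilling regions inside ground-truth boxes and background regions separately, each with its own weight; the mask construction and the weights below are illustrative assumptions rather than the exact DeFeat recipe.

```python
# Hedged sketch: feature distillation decoupled into fg/bg regions.
import torch

def defeat_loss(s_feat, t_feat, fg_mask, w_fg=1.0, w_bg=0.5):
    """s_feat, t_feat: (B, C, H, W); fg_mask: (B, 1, H, W) in {0, 1}."""
    diff = (s_feat - t_feat) ** 2
    fg = (diff * fg_mask).sum() / fg_mask.sum().clamp(min=1)
    bg = (diff * (1 - fg_mask)).sum() / (1 - fg_mask).sum().clamp(min=1)
    return w_fg * fg + w_bg * bg

s = torch.randn(2, 256, 32, 32)
t = torch.randn(2, 256, 32, 32)
mask = (torch.rand(2, 1, 32, 32) > 0.8).float()  # stand-in for GT-box mask
print(defeat_loss(s, t, mask).item())
```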
12 code implementations • NeurIPS 2021 • Kai Han, An Xiao, Enhua Wu, Jianyuan Guo, Chunjing Xu, Yunhe Wang
In this paper, we point out that the attention inside these local patches is also essential for building visual transformers with high performance, and we explore a new architecture, namely, Transformer iN Transformer (TNT).
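The nesting can be sketched in a few lines: an inner transformer models pixel-level sub-patches within each patch, and its flattened output is projected and added to the corresponding outer patch token. All sizes below are illustrative.

```python
# Hedged sketch of the Transformer-iN-Transformer nesting.
import torch
import torch.nn as nn

B, N, D_out = 2, 196, 384                        # patches / outer embedding
S, D_in = 16, 24                                 # sub-patches per patch
inner = nn.TransformerEncoderLayer(D_in, nhead=4, batch_first=True)
outer = nn.TransformerEncoderLayer(D_out, nhead=6, batch_first=True)
proj = nn.Linear(S * D_in, D_out)

outer_tokens = torch.randn(B, N, D_out)
inner_tokens = torch.randn(B * N, S, D_in)       # pixel-level sub-patches
inner_out = inner(inner_tokens)                  # attention inside each patch
outer_tokens = outer_tokens + proj(inner_out.reshape(B, N, S * D_in))
outer_tokens = outer(outer_tokens)               # attention across patches
print(outer_tokens.shape)  # torch.Size([2, 196, 384])
```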
no code implementations • 23 Dec 2020 • Kai Han, Yunhe Wang, Hanting Chen, Xinghao Chen, Jianyuan Guo, Zhenhua Liu, Yehui Tang, An Xiao, Chunjing Xu, Yixing Xu, Zhaohui Yang, Yiman Zhang, DaCheng Tao
Transformer, first applied to the field of natural language processing, is a type of deep neural network mainly based on the self-attention mechanism.
6 code implementations • CVPR 2021 • Zhaohui Yang, Yunhe Wang, Xinghao Chen, Jianyuan Guo, Wei Zhang, Chao Xu, Chunjing Xu, DaCheng Tao, Chang Xu
To achieve an extremely fast NAS while preserving the high accuracy, we propose to identify the vital blocks and make them the priority in the architecture search.
1 code implementation • CVPR 2020 • Jianyuan Guo, Kai Han, Yunhe Wang, Chao Zhang, Zhaohui Yang, Han Wu, Xinghao Chen, Chang Xu
To this end, we propose a hierarchical trinity search framework to simultaneously discover efficient architectures for all components (i.e., backbone, neck, and head) of the object detector in an end-to-end manner.
34 code implementations • CVPR 2020 • Kai Han, Yunhe Wang, Qi Tian, Jianyuan Guo, Chunjing Xu, Chang Xu
Deploying convolutional neural networks (CNNs) on embedded devices is difficult due to the limited memory and computation resources.
Ranked #867 on Image Classification on ImageNet
1 code implementation • ICCV 2019 • Jianyuan Guo, Yuhui Yuan, Lang Huang, Chao Zhang, Jinge Yao, Kai Han
On the other hand, there still exist many useful contextual cues that do not fall into the scope of predefined human parts or attributes.
Ranked #59 on Person Re-Identification on DukeMTMC-reID
6 code implementations • 29 Jul 2019 • Lang Huang, Yuhui Yuan, Jianyuan Guo, Chao Zhang, Xilin Chen, Jingdong Wang
There are two successive attention modules each estimating a sparse affinity matrix.
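The interlacing idea can be sketched as two grouped attention passes: the first gathers long-range (strided) groups of positions, the second short-range (local) groups, with dense attention only inside each small group. The 1-D layout below is a simplification, and a single attention module is reused for brevity where the paper uses two.

```python
# Hedged sketch of interlaced sparse self-attention (1-D positions).
import torch
import torch.nn as nn

def group_attention(x: torch.Tensor, attn: nn.MultiheadAttention,
                    groups: int) -> torch.Tensor:
    """x: (B, N, D); run self-attention independently inside each group."""
    B, N, D = x.shape
    g = x.reshape(B * groups, N // groups, D)
    out, _ = attn(g, g, g, need_weights=False)
    return out.reshape(B, N, D)

B, N, D, P = 2, 64, 32, 8                       # P*P = N, two-stage factoring
x = torch.randn(B, N, D)
attn = nn.MultiheadAttention(D, num_heads=4, batch_first=True)
# long-range stage: a stride-P permutation puts distant positions together
perm = torch.arange(N).reshape(P, P).t().reshape(-1)
x = group_attention(x[:, perm], attn, groups=P)[:, perm.argsort()]
# short-range stage: contiguous local groups
x = group_attention(x, attn, groups=P)
print(x.shape)  # torch.Size([2, 64, 32])
```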
1 code implementation • 2 Jan 2019 • Kai Han, Jianyuan Guo, Chao Zhang, Mingjian Zhu
Based on the considerations above, we propose a novel Attribute-Aware Attention Model ($A^3M$), which can learn local attribute representation and global category representation simultaneously in an end-to-end manner.
Ranked #4 on Fine-Grained Image Classification on CompCars
8 code implementations • 4 Sep 2018 • Yuhui Yuan, Lang Huang, Jianyuan Guo, Chao Zhang, Xilin Chen, Jingdong Wang
To capture richer context information, we further combine our interlaced sparse self-attention scheme with the conventional multi-scale context schemes, including pyramid pooling (Zhao et al., 2017) and atrous spatial pyramid pooling (Chen et al., 2018).
Ranked #9 on Semantic Segmentation on Trans10K