HRNet

Last updated on Feb 14, 2021

hrnet_w18

Parameters 21 Million
FLOPs 6 Billion
File Size 81.75 MB
Training Data ImageNet
Training Resources 4x NVIDIA V100 GPUs

Training Techniques Nesterov Accelerated Gradient, Weight Decay
Architecture Batch Normalization, Convolution, ReLU, Residual Connection
ID hrnet_w18
Epochs 100
Layers 18
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.001
Interpolation bilinear
hrnet_w18_small

Parameters 13 Million
FLOPs 2 Billion
File Size 50.48 MB
Training Data ImageNet
Training Resources 4x NVIDIA V100 GPUs

Training Techniques Nesterov Accelerated Gradient, Weight Decay
Architecture Batch Normalization, Convolution, ReLU, Residual Connection
ID hrnet_w18_small
Epochs 100
Layers 18
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.001
Interpolation bilinear
hrnet_w18_small_v2

Parameters 16 Million
FLOPs 3 Billion
File Size 59.78 MB
Training Data ImageNet
Training Resources 4x NVIDIA V100 GPUs

Training Techniques Nesterov Accelerated Gradient, Weight Decay
Architecture Batch Normalization, Convolution, ReLU, Residual Connection
ID hrnet_w18_small_v2
Epochs 100
Layers 18
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.001
Interpolation bilinear
hrnet_w30

Parameters 38 Million
FLOPs 10 Billion
File Size 144.44 MB
Training Data ImageNet
Training Resources 4x NVIDIA V100 GPUs

Training Techniques Nesterov Accelerated Gradient, Weight Decay
Architecture Batch Normalization, Convolution, ReLU, Residual Connection
ID hrnet_w30
Epochs 100
Layers 30
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.001
Interpolation bilinear
hrnet_w32

Parameters 41 Million
FLOPs 12 Billion
File Size 157.88 MB
Training Data ImageNet
Training Resources 4x NVIDIA V100 GPUs
Training Time 60 hours

Training Techniques Nesterov Accelerated Gradient, Weight Decay
Architecture Batch Normalization, Convolution, ReLU, Residual Connection
ID hrnet_w32
Epochs 100
Layers 32
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.001
Interpolation bilinear
hrnet_w40

Parameters 58 Million
FLOPs 16 Billion
File Size 220.20 MB
Training Data ImageNet
Training Resources 4x NVIDIA V100 GPUs

Training Techniques Nesterov Accelerated Gradient, Weight Decay
Architecture Batch Normalization, Convolution, ReLU, Residual Connection
ID hrnet_w40
Epochs 100
Layers 40
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.001
Interpolation bilinear
hrnet_w44

Parameters 67 Million
FLOPs 19 Billion
File Size 256.50 MB
Training Data ImageNet
Training Resources 4x NVIDIA V100 GPUs

Training Techniques Nesterov Accelerated Gradient, Weight Decay
Architecture Batch Normalization, Convolution, ReLU, Residual Connection
ID hrnet_w44
Epochs 100
Layers 44
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.001
Interpolation bilinear
hrnet_w48

Parameters 77 Million
FLOPs 22 Billion
File Size 296.21 MB
Training Data ImageNet
Training Resources 4x NVIDIA V100 GPUs
Training Time 80 hours

Training Techniques Nesterov Accelerated Gradient, Weight Decay
Architecture Batch Normalization, Convolution, ReLU, Residual Connection
ID hrnet_w48
Epochs 100
Layers 48
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.001
Interpolation bilinear
hrnet_w64

Parameters 128 Million
FLOPs 37 Billion
File Size 489.30 MB
Training Data ImageNet
Training Resources 4x NVIDIA V100 GPUs

Training Techniques Nesterov Accelerated Gradient, Weight Decay
Architecture Batch Normalization, Convolution, ReLU, Residual Connection
ID hrnet_w64
Epochs 100
Layers 64
Crop Pct 0.875
Momentum 0.9
Batch Size 256
Image Size 224
Weight Decay 0.001
Interpolation bilinear

Summary

HRNet, or High-Resolution Net, is a general-purpose convolutional neural network for tasks such as semantic segmentation, object detection, and image classification. It maintains high-resolution representations throughout the whole process. The network starts from a high-resolution convolution stream, gradually adds high-to-low-resolution convolution streams one by one, and connects the multi-resolution streams in parallel. The resulting network consists of several stages ($4$ in the paper), and the $n$th stage contains $n$ streams corresponding to $n$ resolutions. The authors conduct repeated multi-resolution fusions by exchanging information across the parallel streams over and over.
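The repeated multi-resolution fusion can be sketched in plain NumPy: each stream is resized to every other stream's resolution and the aligned maps are summed. This is only a shape-level illustration under simplifying assumptions (nearest-neighbor upsampling and strided slicing stand in for HRNet's actual bilinear upsampling, 1x1 convolutions, and strided 3x3 convolutions; `resize` and `fuse` are hypothetical helper names):

```python
import numpy as np

def resize(x, factor):
    """Rescale a (H, W) map by a power-of-two factor.

    factor > 1 upsamples via nearest-neighbor repetition; factor < 1
    downsamples by strided slicing. HRNet itself uses bilinear
    upsampling and strided convolutions; this stand-in only
    preserves shapes.
    """
    if factor >= 1:
        f = int(factor)
        return np.repeat(np.repeat(x, f, axis=0), f, axis=1)
    s = int(round(1 / factor))
    return x[::s, ::s]

def fuse(streams):
    """Exchange information across parallel streams: each output
    stream is the sum of all input streams resized to its resolution."""
    fused = []
    for i, target in enumerate(streams):
        acc = np.zeros_like(target)
        for j, src in enumerate(streams):
            # Stream j sits at 1/2^j resolution, so moving it to
            # stream i's resolution is a rescale by 2^(j - i).
            acc += resize(src, 2.0 ** (j - i))
        fused.append(acc)
    return fused

# Three parallel streams at 1x, 1/2x, and 1/4x resolution (single channel).
streams = [np.ones((8, 8)), np.ones((4, 4)), np.ones((2, 2))]
out = fuse(streams)
print([o.shape for o in out])  # [(8, 8), (4, 4), (2, 2)]
```

Each fused stream keeps its own resolution while receiving information from all the others, which is the core of HRNet's repeated fusion.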

How do I load this model?

To load a pretrained model:

```python
import timm
m = timm.create_model('hrnet_w30', pretrained=True)
m.eval()
```

Replace the model name with the variant you want to use, e.g. hrnet_w30. You can find the IDs in the model summaries at the top of this page.

How do I train this model?

You can follow the timm recipe scripts for training a new model from scratch.
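The cards above list Nesterov Accelerated Gradient with weight decay (momentum 0.9, weight decay 0.001) as the training techniques. The update rule can be sketched in pure Python on a toy quadratic; this is illustrative only (timm's training script delegates to a real optimizer, and the learning rate and loss here are arbitrary choices):

```python
# Nesterov momentum with L2 weight decay folded into the gradient,
# shown on f(w) = 0.5 * w^2, whose gradient is simply w.
def nesterov_sgd(w, steps, lr=0.1, momentum=0.9, weight_decay=0.001):
    v = 0.0
    for _ in range(steps):
        g = w + weight_decay * w          # loss gradient plus L2 penalty term
        v = momentum * v + g              # velocity accumulation
        w = w - lr * (g + momentum * v)   # Nesterov look-ahead step
    return w

w = nesterov_sgd(10.0, steps=200)
print(abs(w))  # converges toward the minimum at 0
```

The momentum and weight-decay values mirror the hyperparameters in the model cards above; the look-ahead term `g + momentum * v` is what distinguishes Nesterov momentum from plain heavy-ball momentum.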

Citation

@misc{sun2019highresolution,
      title={High-Resolution Representations for Labeling Pixels and Regions}, 
      author={Ke Sun and Yang Zhao and Borui Jiang and Tianheng Cheng and Bin Xiao and Dong Liu and Yadong Mu and Xinggang Wang and Wenyu Liu and Jingdong Wang},
      year={2019},
      eprint={1904.04514},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Results

Image Classification on ImageNet

| Model | Top 1 Accuracy | Top 5 Accuracy |
| --- | --- | --- |
| hrnet_w64 | 79.46% | 94.65% |
| hrnet_w48 | 79.32% | 94.51% |
| hrnet_w40 | 78.93% | 94.48% |
| hrnet_w44 | 78.89% | 94.37% |
| hrnet_w32 | 78.45% | 94.19% |
| hrnet_w30 | 78.21% | 94.22% |
| hrnet_w18 | 76.76% | 93.44% |
| hrnet_w18_small_v2 | 75.11% | 92.41% |
| hrnet_w18_small | 72.34% | 90.68% |
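The accuracy/size trade-off above can be captured as a small lookup table, for example to pick the lightest variant that meets an accuracy target. The values are copied from the tables on this page; the helper name is illustrative:

```python
# (parameters in millions, ImageNet top-1 %) per variant,
# taken from the model cards and results table above.
HRNET_VARIANTS = {
    'hrnet_w18_small':    (13, 72.34),
    'hrnet_w18_small_v2': (16, 75.11),
    'hrnet_w18':          (21, 76.76),
    'hrnet_w30':          (38, 78.21),
    'hrnet_w32':          (41, 78.45),
    'hrnet_w40':          (58, 78.93),
    'hrnet_w44':          (67, 78.89),
    'hrnet_w48':          (77, 79.32),
    'hrnet_w64':          (128, 79.46),
}

def lightest_variant(min_top1):
    """Smallest variant (by parameter count) whose top-1 meets the target."""
    candidates = [(params, name)
                  for name, (params, acc) in HRNET_VARIANTS.items()
                  if acc >= min_top1]
    if not candidates:
        raise ValueError(f'no HRNet variant reaches {min_top1}% top-1')
    return min(candidates)[1]

print(lightest_variant(78.0))  # hrnet_w30
```

Note that hrnet_w44 is both larger and slightly less accurate than hrnet_w40 on this benchmark, so a size-for-accuracy selection never returns it.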