1 code implementation • 4 Nov 2022 • Ryan Humble, Maying Shen, Jorge Albericio Latorre, Eric Darve1, Jose M. Alvarez
Structured channel pruning has been shown to significantly accelerate inference time for convolution neural networks (CNNs) on modern hardware, with a relatively minor loss of network accuracy.
2 code implementations • 16 Apr 2021 • Asit Mishra, Jorge Albericio Latorre, Jeff Pool, Darko Stosic, Dusan Stosic, Ganesh Venkatesh, Chong Yu, Paulius Micikevicius
We present the design and behavior of Sparse Tensor Cores, which exploit a 2:4 (50%) sparsity pattern that leads to twice the math throughput of dense matrix units.