no code implementations • 19 Jan 2024 • Ian Colbert, Alessandro Pappalardo, Jakoba Petri-Koenig, Yaman Umuroglu
Recent studies show that reducing the precision of the accumulator as well can further improve hardware efficiency, at the risk of numerical overflow, which introduces arithmetic errors that can degrade model accuracy.
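To make the overflow risk concrete, here is a minimal sketch (my own toy, not the authors' code) that accumulates products of int8-range weights and activations into progressively narrower two's-complement accumulators and compares against a wide 64-bit reference; the helper name `accumulate` and the vector sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.integers(-128, 128, size=1024, dtype=np.int64)  # int8-range weights
x = rng.integers(-128, 128, size=1024, dtype=np.int64)  # int8-range activations

exact = int(np.dot(w, x))  # wide (64-bit) reference accumulation

def accumulate(products, acc_bits):
    """Accumulate with two's-complement wraparound at `acc_bits` bits."""
    lo, hi = -(1 << (acc_bits - 1)), 1 << (acc_bits - 1)
    acc = 0
    for p in products:
        acc = (acc + int(p) - lo) % (hi - lo) + lo  # wrap into [lo, hi)
    return acc

for bits in (32, 24, 16):
    approx = accumulate(w * x, bits)
    print(f"{bits}-bit accumulator: {approx} (error {exact - approx})")
```

With 1024 products the running sum comfortably fits 32 bits, but a 16-bit accumulator wraps around and the dot product silently corrupts, which is exactly the arithmetic error the abstract refers to.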
no code implementations • ICCV 2023 • Ian Colbert, Alessandro Pappalardo, Jakoba Petri-Koenig
We apply our method to deep learning-based computer vision tasks to show that A2Q can train QNNs for low-precision accumulators while maintaining model accuracy competitive with a floating-point baseline.
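A hedged sketch of the accumulator-aware idea behind A2Q, not the paper's exact formulation: if activations are unsigned N-bit values, capping the L1 norm of a layer's integer weights at (2^(P-1) - 1) / (2^N - 1) guarantees the worst-case dot product fits in a signed P-bit accumulator. The helper names and the tiny layer size below are my own assumptions.

```python
import numpy as np

def l1_limit(acc_bits, act_bits):
    """Largest ||w||_1 that guarantees no overflow in a signed P-bit accumulator."""
    return (2 ** (acc_bits - 1) - 1) / (2 ** act_bits - 1)

def constrain_weights(w_int, acc_bits, act_bits):
    """Rescale integer weights (truncating toward zero) to meet the L1 cap."""
    limit = l1_limit(acc_bits, act_bits)
    norm = np.abs(w_int).sum()
    if norm <= limit:
        return w_int
    # truncation toward zero guarantees the rescaled L1 norm stays under the cap
    return np.trunc(w_int * limit / norm).astype(np.int64)

w = np.random.default_rng(0).integers(-128, 128, size=8)  # tiny layer for illustration
w16 = constrain_weights(w, acc_bits=16, act_bits=8)
print(np.abs(w).sum(), "->", np.abs(w16).sum(), "(limit", l1_limit(16, 8), ")")
```

A2Q enforces a constraint of this kind during training rather than as a post-hoc rescaling, which is what lets accuracy stay competitive with the floating-point baseline.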
no code implementations • 31 Jan 2023 • Ian Colbert, Alessandro Pappalardo, Jakoba Petri-Koenig
Across all of our benchmark models trained with 8-bit weights and activations, we observe that constraining the hidden layers of quantized neural networks to fit into 16-bit accumulators yields an average 98.2% sparsity with an estimated compression rate of 46.5x, all while maintaining 99.2% of the floating-point performance.
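One illustrative way to see why such a tight accumulator budget induces sparsity (a toy of my own, not the paper's method): zero the smallest-magnitude weights until the layer's L1 norm fits the no-overflow budget of a signed 16-bit accumulator fed by unsigned 8-bit activations.

```python
import numpy as np

def prune_to_budget(w_int, l1_budget):
    """Zero the smallest-magnitude weights until ||w||_1 fits the budget."""
    w = w_int.copy()
    for i in np.argsort(np.abs(w)):      # smallest magnitudes first
        if np.abs(w).sum() <= l1_budget:
            break
        w[i] = 0
    return w

w = np.random.default_rng(0).integers(-128, 128, size=4096)
budget = (2 ** 15 - 1) // (2 ** 8 - 1)   # = 128, so worst-case |dot| <= 32767
w16 = prune_to_budget(w, budget)
print("sparsity:", float((w16 == 0).mean()))
```

In this toy nearly everything is pruned because the random weights are uniformly large; a network trained under the constraint instead concentrates magnitude in a few weights, which is consistent with the high sparsity the abstract reports.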
no code implementations • 14 Sep 2022 • Alexander Cann, Ian Colbert, Ihab Amer
The widespread adoption of deep neural networks in computer vision applications has brought forth significant interest in adversarial robustness.
no code implementations • 10 Mar 2022 • Ian Colbert, Mehdi Saeedi
Recent advancements in deep reinforcement learning have brought forth an impressive display of highly skilled artificial agents capable of complex intelligent behavior.
no code implementations • 23 Nov 2021 • Ian Colbert, Jake Daly, Norm Rubin
GPU compilers are complex software programs with many optimizations specific to target hardware.
2 code implementations • 15 Oct 2021 • Xinyu Zhang, Ian Colbert, Ken Kreutz-Delgado, Srinjoy Das
State-of-the-art quantization techniques are currently applied to both the weights and activations; however, pruning is most often applied to only the weights of the network.
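A minimal sketch of the standard recipe this sentence describes, with my own helper names and parameters: symmetric uniform quantization applied to both weights and activations, magnitude pruning applied to the weights only.

```python
import numpy as np

def quantize(x, bits=8):
    """Symmetric uniform quantization to signed `bits`-bit integers."""
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1)
    return np.round(x / scale).astype(np.int8), scale

def prune(w, sparsity=0.5):
    """Zero the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(sparsity * w.size)
    thresh = np.sort(np.abs(w).ravel())[k]
    return np.where(np.abs(w) < thresh, 0.0, w)

rng = np.random.default_rng(0)
w = prune(rng.normal(size=(64, 64)))        # pruning: weights only
q_w, s_w = quantize(w)                      # quantization: weights...
q_x, s_x = quantize(rng.normal(size=64))    # ...and activations
```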
1 code implementation • 15 Jul 2021 • Ian Colbert, Ken Kreutz-Delgado, Srinjoy Das
We analyze and compare the inference properties of convolution-based upsampling algorithms using a quantitative model of incurred time and energy costs, and we show that using deconvolution for inference at the edge improves both system latency and energy efficiency compared to its sub-pixel and resize convolution counterparts.
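A rough MAC-count model (my own simplification, not the paper's full time/energy model) shows where the gap comes from: a resize convolution pays for a full convolution at the upsampled resolution, roughly s^2 more multiply-accumulates than a transposed convolution producing the same output size.

```python
def deconv_macs(h, w, c_in, c_out, k=3, s=2):
    # transposed convolution: each input pixel scatters k*k products per channel pair
    return h * w * k * k * c_in * c_out

def resize_conv_macs(h, w, c_in, c_out, k=3, s=2):
    # nearest-neighbor resize, then a convolution at the upsampled resolution
    return (s * h) * (s * w) * k * k * c_in * c_out

h = w = 56
print("deconv:", deconv_macs(h, w, 64, 64))
print("resize:", resize_conv_macs(h, w, 64, 64))  # s^2 = 4x more MACs at 2x upsampling
```

This sketch counts only arithmetic; a full cost model like the paper's would also account for memory traffic and per-operation energy.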
no code implementations • 31 Jan 2021 • Siqiao Ruan, Ian Colbert, Ken Kreutz-Delgado, Srinjoy Das
Deploying Deep Learning algorithms on embedded hardware is characterized by challenges such as constraints on device power consumption, the availability of labeled data, and limited internet bandwidth for frequent training on cloud servers.
no code implementations • 28 Oct 2019 • Alexander Potapov, Ian Colbert, Ken Kreutz-Delgado, Alexander Cloninger, Srinjoy Das
Stochastic-sampling-based Generative Neural Networks, such as Restricted Boltzmann Machines and Generative Adversarial Networks, are now used in applications including denoising, image occlusion removal, pattern completion, and motion synthesis.
no code implementations • 11 Mar 2019 • Ian Colbert, Ken Kreutz-Delgado, Srinjoy Das
The power budget for embedded hardware implementations of Deep Learning algorithms can be extremely tight.