Search Results for author: Mohamed M. Sabry Aly

Found 6 papers, 0 papers with code

From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks

no code implementations • 9 May 2024 • Xue Geng, Zhe Wang, Chunyun Chen, Qing Xu, Kaixin Xu, Chao Jin, Manas Gupta, Xulei Yang, Zhenghua Chen, Mohamed M. Sabry Aly, Jie Lin, Min Wu, XiaoLi Li

To address these challenges, researchers have developed various model compression techniques such as model quantization and model pruning.
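A minimal illustration of the two techniques this abstract names, magnitude pruning and uniform quantization, applied to a single weight matrix with NumPy. This is a generic sketch for orientation, not the survey's methodology; the sparsity and bit-width values are arbitrary.

```python
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights until `sparsity` fraction are zero."""
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)

def uniform_quantize(w: np.ndarray, num_bits: int) -> np.ndarray:
    """Symmetric uniform quantization to `num_bits` bits (fake-quantized back to float)."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    return np.round(w / scale).clip(-qmax, qmax) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
w_compressed = uniform_quantize(magnitude_prune(w, sparsity=0.5), num_bits=8)
print("nonzero fraction:", np.count_nonzero(w_compressed) / w_compressed.size)
```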

OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization

no code implementations • 23 May 2022 • Peng Hu, Xi Peng, Hongyuan Zhu, Mohamed M. Sabry Aly, Jie Lin

Numerous network compression methods such as pruning and quantization have been proposed to reduce the model size significantly, the key to which is finding a suitable compression allocation (e.g., pruning sparsity and quantization codebook) for each layer.

Quantization
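A hedged sketch of what a per-layer "compression allocation" can look like in practice: each layer is assigned its own pruning sparsity and its own quantization codebook (here learned with a tiny 1-D k-means). The layer names and allocation values are made up for illustration; this is not the OPQ algorithm itself.

```python
import numpy as np

def kmeans_codebook(values: np.ndarray, k: int, iters: int = 20) -> np.ndarray:
    """Learn a k-entry codebook for 1-D weight values with plain k-means."""
    codebook = np.quantile(values, np.linspace(0.0, 1.0, k))
    for _ in range(iters):
        assign = np.argmin(np.abs(values[:, None] - codebook[None, :]), axis=1)
        for j in range(k):
            if np.any(assign == j):
                codebook[j] = values[assign == j].mean()
    return codebook

def compress_layer(w: np.ndarray, sparsity: float, codebook_bits: int) -> np.ndarray:
    """Prune by magnitude, then snap surviving weights to the nearest codebook entry."""
    thr = np.quantile(np.abs(w), sparsity)
    pruned = np.where(np.abs(w) <= thr, 0.0, w)
    survivors = pruned[pruned != 0.0]
    codebook = kmeans_codebook(survivors, k=2 ** codebook_bits)
    quantized = codebook[np.argmin(np.abs(survivors[:, None] - codebook[None, :]), axis=1)]
    out = pruned.copy()
    out[pruned != 0.0] = quantized
    return out

# Per-layer allocation: (pruning sparsity, codebook bits) chosen per layer.
allocation = {"conv1": (0.3, 4), "conv2": (0.6, 3)}
rng = np.random.default_rng(0)
layers = {name: rng.normal(size=(32, 32)) for name in allocation}
compressed = {name: compress_layer(w, *allocation[name]) for name, w in layers.items()}
```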

Delving into Channels: Exploring Hyperparameter Space of Channel Bit Widths with Linear Complexity

no code implementations • 29 Sep 2021 • Zhe Wang, Jie Lin, Xue Geng, Mohamed M. Sabry Aly, Vijay Chandrasekhar

We formulate the quantization of deep neural networks as a rate-distortion optimization problem, and present an ultra-fast algorithm to search the bit allocation of channels.

Quantization
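A hedged sketch of rate-distortion style bit allocation over channels: greedily lower the bit width of whichever channel increases quantization distortion the least, until an average-bit budget is met. This is a generic greedy illustration of the problem the abstract formulates, not the paper's linear-complexity search.

```python
import numpy as np

def quant_mse(x: np.ndarray, bits: int) -> float:
    """Distortion (MSE) from symmetric uniform quantization of x to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(np.max(np.abs(x)), 1e-12) / qmax
    xq = np.round(x / scale).clip(-qmax, qmax) * scale
    return float(np.mean((x - xq) ** 2))

def allocate_bits(channels, avg_bit_budget=4, max_bits=8, min_bits=2):
    bits = [max_bits] * len(channels)
    while sum(bits) / len(bits) > avg_bit_budget:
        # Cost of dropping one bit from each channel that can still go lower.
        costs = [
            (quant_mse(c, b - 1) - quant_mse(c, b)) if b > min_bits else np.inf
            for c, b in zip(channels, bits)
        ]
        bits[int(np.argmin(costs))] -= 1
    return bits

rng = np.random.default_rng(0)
channels = [rng.normal(scale=s, size=256) for s in (0.1, 0.5, 1.0, 2.0)]
print(allocate_bits(channels))  # wider-spread channels tend to keep more bits
```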

PSRR-MaxpoolNMS: Pyramid Shifted MaxpoolNMS with Relationship Recovery

no code implementations • CVPR 2021 • Tianyi Zhang, Jie Lin, Peng Hu, Bin Zhao, Mohamed M. Sabry Aly

Unlike convolutions, which are inherently parallel, the de facto standard for NMS, namely GreedyNMS, cannot be easily parallelized and can therefore become the performance bottleneck in convolutional object detection pipelines.

Object Detection
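For reference, a standard GreedyNMS implementation: the while-loop is inherently sequential, since each kept box depends on everything suppressed before it, which is the parallelization bottleneck this paper targets with a maxpool-based scheme. The sketch below is the conventional algorithm, not the paper's PSRR-MaxpoolNMS.

```python
import numpy as np

def greedy_nms(boxes: np.ndarray, scores: np.ndarray, iou_thr: float = 0.5):
    """boxes: (N, 4) as [x1, y1, x2, y2]; returns indices of kept boxes."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thr]  # suppress overlapping boxes, then iterate
    return keep
```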

Towards Effective 2-bit Quantization: Pareto-optimal Bit Allocation for Deep CNNs Compression

no code implementations • 25 Sep 2019 • Zhe Wang, Jie Lin, Mohamed M. Sabry Aly, Sean I Young, Vijay Chandrasekhar, Bernd Girod

In this paper, we address the important problem of how to optimize the bit allocation of weights and activations for deep CNN compression.

Quantization
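A hedged sketch of the trade-off this abstract describes: score candidate (weight-bit, activation-bit) allocations by model size versus quantization distortion and keep only the Pareto-optimal ones. The candidate grid and distortion proxy are assumptions for illustration; this is not the paper's allocation algorithm.

```python
import numpy as np

def pareto_front(points):
    """Keep points for which no other point is at least as small in both size and distortion."""
    keep = []
    for i, (size_i, dist_i) in enumerate(points):
        dominated = any(
            size_j <= size_i and dist_j <= dist_i and (size_j, dist_j) != (size_i, dist_i)
            for j, (size_j, dist_j) in enumerate(points) if j != i
        )
        if not dominated:
            keep.append(i)
    return keep

def quant_mse(x, bits):
    qmax = 2 ** (bits - 1) - 1
    scale = max(np.max(np.abs(x)), 1e-12) / qmax
    return float(np.mean((x - np.round(x / scale).clip(-qmax, qmax) * scale) ** 2))

rng = np.random.default_rng(0)
weights, activations = rng.normal(size=4096), np.abs(rng.normal(size=4096))
candidates = [(wb, ab) for wb in (2, 3, 4, 8) for ab in (2, 4, 8)]
points = [
    (wb * weights.size + ab * activations.size,            # total bits stored/moved
     quant_mse(weights, wb) + quant_mse(activations, ab))  # proxy for accuracy loss
    for wb, ab in candidates
]
for idx in pareto_front(points):
    print(candidates[idx], points[idx])
```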

Dataflow-based Joint Quantization of Weights and Activations for Deep Neural Networks

no code implementations • 4 Jan 2019 • Xue Geng, Jie Fu, Bin Zhao, Jie Lin, Mohamed M. Sabry Aly, Christopher Pal, Vijay Chandrasekhar

This paper addresses a challenging problem: how to reduce energy consumption without incurring a performance drop when deploying deep neural networks (DNNs) at the inference stage.

Quantization
