no code implementations • 8 Feb 2024 • Seyedarmin Azizi, Mahdi Nazemi, Massoud Pedram
This paper addresses the memory limitations of ViTs by introducing an activation-aware model compression methodology that uses selective low-rank weight tensor approximations in different layers to reduce the parameter count.
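The core compression primitive can be illustrated with a truncated SVD; this is a minimal sketch of low-rank weight factorization, not the paper's full activation-aware, per-layer selection procedure (the matrix size and rank below are made-up illustration values):

```python
import numpy as np

def low_rank_approx(W, rank):
    """Factor W (m x n) into A (m x r) @ B (r x n) via truncated SVD."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # absorb singular values into the left factor
    B = Vt[:rank, :]
    return A, B

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
A, B = low_rank_approx(W, rank=8)

# Parameter count drops from 64*64 = 4096 to 2*64*8 = 1024,
# at the cost of the reconstruction error ||W - A @ B||.
params_before = W.size
params_after = A.size + B.size
```

Replacing a dense layer's weight `W` with the pair `(A, B)` turns one matrix multiply into two thinner ones, which is where the memory and compute savings come from.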
no code implementations • 3 Dec 2023 • Seyedarmin Azizi, Mahdi Nazemi, Mehdi Kamal, Massoud Pedram
This paper presents a mixed-computation neural network processing approach for edge applications that incorporates low-precision (low bit-width) Posit and low-precision fixed-point (FixP) number systems.
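Of the two formats, fixed-point is the simpler to sketch. The following is an illustrative symmetric FixP quantize/dequantize round trip (the bit widths are assumptions for the example; Posit arithmetic is omitted for brevity):

```python
import numpy as np

def to_fixp(x, total_bits=8, frac_bits=4):
    """Quantize to signed fixed-point with `frac_bits` fractional bits."""
    scale = 2 ** frac_bits
    lo, hi = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    return np.clip(np.round(x * scale), lo, hi).astype(np.int32)

def from_fixp(q, frac_bits=4):
    """Map fixed-point integers back to real values."""
    return q.astype(np.float64) / (2 ** frac_bits)

x = np.array([0.3, -1.25, 2.0])
q = to_fixp(x)            # integer codes: [5, -20, 32]
x_hat = from_fixp(q)      # values on the 1/16 grid: [0.3125, -1.25, 2.0]
```

Values already on the `1/2^frac_bits` grid survive exactly; everything else is rounded to the nearest representable step, which bounds the error at half an LSB.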
no code implementations • 12 Aug 2023 • Seyedarmin Azizi, Mahdi Nazemi, Arash Fayyazi, Massoud Pedram
As a result, our proposed method advances neural network design optimization, enabling rapid model design and implementation in resource-constrained settings and supporting scalable deep learning solutions.
1 code implementation • 4 Mar 2023 • Jung Hwan Heo, Arash Fayyazi, Mahdi Nazemi, Massoud Pedram
Token pruning has emerged as an effective solution to speed up the inference of large Transformer models.
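A common form of token pruning keeps only the top-k tokens ranked by how much attention they receive; this is a generic sketch of that idea, not necessarily the scoring criterion used in the paper:

```python
import numpy as np

def prune_tokens(tokens, attn, keep):
    """tokens: (n, d); attn: (n, n) row-stochastic attention; keep: int."""
    importance = attn.mean(axis=0)                   # attention each token receives
    kept = np.sort(np.argsort(importance)[-keep:])   # top-k, original order
    return tokens[kept], kept

rng = np.random.default_rng(1)
n, d = 8, 4
tokens = rng.standard_normal((n, d))
logits = rng.standard_normal((n, n))
attn = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

pruned, kept = prune_tokens(tokens, attn, keep=4)   # halve the sequence length
```

Since self-attention cost grows quadratically in sequence length, dropping half the tokens before the next layer roughly quarters that layer's attention cost.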
no code implementations • 30 Jul 2022 • Soheil Nazar Shahsavani, Arash Fayyazi, Mahdi Nazemi, Massoud Pedram
Recent efforts for improving the performance of neural network (NN) accelerators that meet today's application requirements have given rise to a new trend of logic-based NN inference relying on fixed function combinational logic.
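The fixed-function-logic idea rests on the observation that a neuron with binary inputs and a binarized output is just a Boolean function, so it can be tabulated once into a truth table and evaluated with no arithmetic at inference time. A toy sketch (weights and threshold are made-up illustration values, not from the paper):

```python
import itertools

# A tiny binarized neuron: weighted sum of 3 binary inputs vs. a threshold.
weights = [2, -1, 3]
threshold = 1

def neuron(bits):
    return int(sum(w * b for w, b in zip(weights, bits)) > threshold)

# Enumerate all 2^3 input patterns once into a lookup table (LUT).
lut = {bits: neuron(bits) for bits in itertools.product((0, 1), repeat=3)}

def infer(bits):
    return lut[bits]   # pure table lookup: no multiplies, no adds
```

On an FPGA, such a table maps directly onto LUT primitives, which is what makes combinational-logic inference attractive for latency-critical designs.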
no code implementations • 7 Apr 2021 • Mahdi Nazemi, Arash Fayyazi, Amirhossein Esmaili, Atharva Khare, Soheil Nazar Shahsavani, Massoud Pedram
While there is a large body of research on efficient processing of deep neural networks (DNNs), ultra-low-latency realization of these models for applications with stringent, sub-microsecond latency requirements continues to be an unresolved, challenging problem.
1 code implementation • 3 Nov 2020 • Souvik Kundu, Mahdi Nazemi, Peter A. Beerel, Massoud Pedram
This paper presents a dynamic network rewiring (DNR) method to generate pruned deep neural network (DNN) models that are robust against adversarial attacks yet maintain high accuracy on clean images.
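The pruning half of such a scheme can be sketched with plain magnitude pruning; note that DNR itself also *rewires* (regrows) connections during training, which this minimal sketch omits:

```python
import numpy as np

def magnitude_prune(W, sparsity):
    """Zero out the fraction `sparsity` of smallest-magnitude weights."""
    k = int(W.size * sparsity)
    if k == 0:
        return W.copy(), np.ones_like(W, dtype=bool)
    thresh = np.partition(np.abs(W).ravel(), k - 1)[k - 1]
    mask = np.abs(W) > thresh          # keep only weights above the threshold
    return W * mask, mask

rng = np.random.default_rng(2)
W = rng.standard_normal((16, 16))
Wp, mask = magnitude_prune(W, sparsity=0.9)   # keep ~10% of the weights
```

The boolean `mask` is what a rewiring scheme would then update over training, letting pruned connections return if they become important.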
no code implementations • 30 Jul 2020 • Mahdi Nazemi, Amirhossein Esmaili, Arash Fayyazi, Massoud Pedram
The proposed hybrid machine learning model has the same level of accuracy (i.e., $\pm$1%) as NNs while achieving at least 10% improvement in accuracy compared to HD learning models.
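For readers unfamiliar with the HD side of the comparison, here is a minimal hyperdimensional classification sketch (illustrative only, not the paper's hybrid model): inputs are encoded with a fixed random projection into a high-dimensional bipolar space, training encodings are bundled into one prototype per class, and queries are classified by similarity to the prototypes.

```python
import numpy as np

rng = np.random.default_rng(3)
D, d = 2000, 10                       # HD dimension, input dimension
proj = rng.standard_normal((D, d))    # fixed random projection ("item memory")

def encode(x):
    return np.sign(proj @ x)          # bipolar hypervector in {-1, 0, +1}^D

# Two toy classes centered at +1 and -1 in input space.
xs0 = rng.standard_normal((20, d)) + 1.0
xs1 = rng.standard_normal((20, d)) - 1.0

# Bundle each class's encodings into a single prototype hypervector.
proto0 = np.sign(sum(encode(x) for x in xs0))
proto1 = np.sign(sum(encode(x) for x in xs1))

def classify(x):
    h = encode(x)
    return 0 if h @ proto0 >= h @ proto1 else 1
```

Training is a single pass of additions with no gradients, which is why HD models are cheap to fit; the accuracy gap relative to NNs is what hybrid schemes aim to close.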
1 code implementation • 29 Jan 2020 • Souvik Kundu, Mahdi Nazemi, Massoud Pedram, Keith M. Chugg, Peter A. Beerel
We also compared the performance of our proposed architectures with that of ShuffleNet and MobileNetV2.
1 code implementation • 19 Dec 2018 • Amirhossein Esmaili, Mahdi Nazemi, Massoud Pedram
Energy efficiency is one of the most critical design criteria for modern embedded systems such as multiprocessor system-on-chips (MPSoCs).
Operating Systems • Distributed, Parallel, and Cluster Computing
no code implementations • 23 Jul 2018 • Mahdi Nazemi, Ghasem Pasandi, Massoud Pedram
Deep neural networks have been successfully deployed in a wide variety of applications including computer vision and speech recognition.
no code implementations • 3 Jun 2018 • Mahdi Nazemi, Massoud Pedram
Lop allows researchers and designers to quickly compare quality of their models using various data representations and arithmetic operations in Python and contrast the hardware cost of viable representations by synthesizing them on their target platforms (e.g., FPGA or ASIC).
no code implementations • 11 Jan 2018 • Mahdi Nazemi, Amir Erfan Eshratifar, Massoud Pedram
With ever-increasing application of machine learning models in various domains such as image classification, speech recognition and synthesis, and health care, designing efficient hardware for these models has gained a lot of popularity.
no code implementations • 13 Dec 2017 • Sheng Lin, Ning Liu, Mahdi Nazemi, Hongjia Li, Caiwen Ding, Yanzhi Wang, Massoud Pedram
The large model size of DNNs, while providing excellent accuracy, also burdens the embedded platforms with intensive computation and storage.
no code implementations • 6 Jul 2017 • Mahdi Nazemi, Shahin Nazarian, Massoud Pedram
Independent Component Analysis (ICA) is a dimensionality reduction technique that can boost efficiency of machine learning models that deal with probability density functions, e.g., Bayesian neural networks.
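ICA's core mechanics can be shown with a bare-bones FastICA (tanh nonlinearity) recovering two independent non-Gaussian sources from a linear mixture; this is a didactic sketch under made-up data, and practical work would use a library implementation such as scikit-learn's `FastICA`:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000
S = rng.uniform(-1, 1, size=(2, n))       # independent (non-Gaussian) sources
A = np.array([[1.0, 0.6], [0.5, 1.0]])    # unknown mixing matrix
X = A @ S                                  # observed mixtures

# Center and whiten the observations.
X = X - X.mean(axis=1, keepdims=True)
d_eig, E = np.linalg.eigh(X @ X.T / n)
Xw = E @ np.diag(d_eig ** -0.5) @ E.T @ X

# Symmetric FastICA fixed-point iterations with g(u) = tanh(u).
W = rng.standard_normal((2, 2))
for _ in range(200):
    WX = W @ Xw
    G, Gp = np.tanh(WX), 1 - np.tanh(WX) ** 2
    W = (G @ Xw.T) / n - np.diag(Gp.mean(axis=1)) @ W
    U, _, Vt = np.linalg.svd(W)
    W = U @ Vt                             # decorrelate: W <- (W W^T)^(-1/2) W

Y = W @ Xw                                 # recovered sources (up to sign/order)
```

Each row of `Y` should correlate strongly with one of the original sources, up to the sign and permutation ambiguity inherent to ICA.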