no code implementations • 8 Feb 2024 • Seyedarmin Azizi, Mahdi Nazemi, Massoud Pedram
This paper addresses the memory limitations of ViTs by introducing an activation-aware model compression methodology that uses selective low-rank weight tensor approximations in different layers to reduce the parameter count.
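The core compression primitive can be illustrated with a truncated SVD; this is a minimal sketch of low-rank weight factorization, not the paper's full activation-aware, per-layer selection procedure (the matrix size and rank below are made-up illustration values):

```python
import numpy as np

def low_rank_approx(W, rank):
    """Factor W (m x n) into A (m x r) @ B (r x n) via truncated SVD."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # absorb singular values into the left factor
    B = Vt[:rank, :]
    return A, B

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
A, B = low_rank_approx(W, rank=8)

# Parameter count drops from 64*64 = 4096 to 2*64*8 = 1024,
# at the cost of the reconstruction error ||W - A @ B||.
params_before = W.size
params_after = A.size + B.size
```

Replacing a dense layer's weight `W` with the pair `(A, B)` turns one matrix multiply into two thinner ones, which is where the memory and compute savings come from.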
no code implementations • 3 Dec 2023 • Seyedarmin Azizi, Mahdi Nazemi, Mehdi Kamal, Massoud Pedram
This paper presents a mixed-computation neural network processing approach for edge applications that incorporates low-precision (low bit-width) Posit and low-precision fixed-point (FixP) number systems.
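Of the two formats, fixed-point is the simpler to sketch. The following is an illustrative symmetric FixP quantize/dequantize round trip (the bit widths are assumptions for the example; Posit arithmetic is omitted for brevity):

```python
import numpy as np

def to_fixp(x, total_bits=8, frac_bits=4):
    """Quantize to signed fixed-point with `frac_bits` fractional bits."""
    scale = 2 ** frac_bits
    lo, hi = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    return np.clip(np.round(x * scale), lo, hi).astype(np.int32)

def from_fixp(q, frac_bits=4):
    """Map fixed-point integers back to real values."""
    return q.astype(np.float64) / (2 ** frac_bits)

x = np.array([0.3, -1.25, 2.0])
q = to_fixp(x)            # integer codes: [5, -20, 32]
x_hat = from_fixp(q)      # values on the 1/16 grid: [0.3125, -1.25, 2.0]
```

Values already on the `1/2^frac_bits` grid survive exactly; everything else is rounded to the nearest representable step, which bounds the error at half an LSB.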
no code implementations • 12 Aug 2023 • Seyedarmin Azizi, Mahdi Nazemi, Arash Fayyazi, Massoud Pedram
As a result, our proposed method advances neural network design optimization, enabling rapid model design and implementation in resource-constrained settings and supporting scalable deep learning solutions.
1 code implementation • 4 Mar 2023 • Jung Hwan Heo, Arash Fayyazi, Mahdi Nazemi, Massoud Pedram
Token pruning has emerged as an effective solution to speed up the inference of large Transformer models.
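A common form of token pruning keeps only the top-k tokens ranked by how much attention they receive; this is a generic sketch of that idea, not necessarily the scoring criterion used in the paper:

```python
import numpy as np

def prune_tokens(tokens, attn, keep):
    """tokens: (n, d); attn: (n, n) row-stochastic attention; keep: int."""
    importance = attn.mean(axis=0)                   # attention each token receives
    kept = np.sort(np.argsort(importance)[-keep:])   # top-k, original order
    return tokens[kept], kept

rng = np.random.default_rng(1)
n, d = 8, 4
tokens = rng.standard_normal((n, d))
logits = rng.standard_normal((n, n))
attn = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

pruned, kept = prune_tokens(tokens, attn, keep=4)   # halve the sequence length
```

Since self-attention cost grows quadratically in sequence length, dropping half the tokens before the next layer roughly quarters that layer's attention cost.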
no code implementations • 30 Jul 2022 • Soheil Nazar Shahsavani, Arash Fayyazi, Mahdi Nazemi, Massoud Pedram
Recent efforts for improving the performance of neural network (NN) accelerators that meet today's application requirements have given rise to a new trend of logic-based NN inference relying on fixed function combinational logic.
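The fixed-function-logic idea rests on the observation that a neuron with binary inputs and a binarized output is just a Boolean function, so it can be tabulated once into a truth table and evaluated with no arithmetic at inference time. A toy sketch (weights and threshold are made-up illustration values, not from the paper):

```python
import itertools

# A tiny binarized neuron: weighted sum of 3 binary inputs vs. a threshold.
weights = [2, -1, 3]
threshold = 1

def neuron(bits):
    return int(sum(w * b for w, b in zip(weights, bits)) > threshold)

# Enumerate all 2^3 input patterns once into a lookup table (LUT).
lut = {bits: neuron(bits) for bits in itertools.product((0, 1), repeat=3)}

def infer(bits):
    return lut[bits]   # pure table lookup: no multiplies, no adds
```

On an FPGA, such a table maps directly onto LUT primitives, which is what makes combinational-logic inference attractive for latency-critical designs.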
no code implementations • 7 Apr 2021 • Mahdi Nazemi, Arash Fayyazi, Amirhossein Esmaili, Atharva Khare, Soheil Nazar Shahsavani, Massoud Pedram
While there is a large body of research on efficient processing of deep neural networks (DNNs), ultra-low-latency realization of these models for applications with stringent, sub-microsecond latency requirements continues to be an unresolved, challenging problem.
1 code implementation • 3 Nov 2020 • Souvik Kundu, Mahdi Nazemi, Peter A. Beerel, Massoud Pedram
This paper presents a dynamic network rewiring (DNR) method to generate pruned deep neural network (DNN) models that are robust against adversarial attacks yet maintain high accuracy on clean images.
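The pruning half of such a scheme can be sketched with plain magnitude pruning; note that DNR itself also *rewires* (regrows) connections during training, which this minimal sketch omits:

```python
import numpy as np

def magnitude_prune(W, sparsity):
    """Zero out the fraction `sparsity` of smallest-magnitude weights."""
    k = int(W.size * sparsity)
    if k == 0:
        return W.copy(), np.ones_like(W, dtype=bool)
    thresh = np.partition(np.abs(W).ravel(), k - 1)[k - 1]
    mask = np.abs(W) > thresh          # keep only weights above the threshold
    return W * mask, mask

rng = np.random.default_rng(2)
W = rng.standard_normal((16, 16))
Wp, mask = magnitude_prune(W, sparsity=0.9)   # keep ~10% of the weights
```

The boolean `mask` is what a rewiring scheme would then update over training, letting pruned connections return if they become important.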
no code implementations • 30 Jul 2020 • Mahdi Nazemi, Amirhossein Esmaili, Arash Fayyazi, Massoud Pedram
The proposed hybrid machine learning model has the same level of accuracy (i.e., $\pm$1%) as NNs while achieving at least 10% improvement in accuracy compared to HD learning models.
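For readers unfamiliar with the HD side of the comparison, here is a minimal hyperdimensional classification sketch (illustrative only, not the paper's hybrid model): inputs are encoded with a fixed random projection into a high-dimensional bipolar space, training encodings are bundled into one prototype per class, and queries are classified by similarity to the prototypes.

```python
import numpy as np

rng = np.random.default_rng(3)
D, d = 2000, 10                       # HD dimension, input dimension
proj = rng.standard_normal((D, d))    # fixed random projection ("item memory")

def encode(x):
    return np.sign(proj @ x)          # bipolar hypervector in {-1, 0, +1}^D

# Two toy classes centered at +1 and -1 in input space.
xs0 = rng.standard_normal((20, d)) + 1.0
xs1 = rng.standard_normal((20, d)) - 1.0

# Bundle each class's encodings into a single prototype hypervector.
proto0 = np.sign(sum(encode(x) for x in xs0))
proto1 = np.sign(sum(encode(x) for x in xs1))

def classify(x):
    h = encode(x)
    return 0 if h @ proto0 >= h @ proto1 else 1
```

Training is a single pass of additions with no gradients, which is why HD models are cheap to fit; the accuracy gap relative to NNs is what hybrid schemes aim to close.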
1 code implementation • 29 Jan 2020 • Souvik Kundu, Mahdi Nazemi, Massoud Pedram, Keith M. Chugg, Peter A. Beerel
We also compared the performance of our proposed architectures with that of ShuffleNet and MobileNetV2.
1 code implementation • 19 Dec 2018 • Amirhossein Esmaili, Mahdi Nazemi, Massoud Pedram
Energy efficiency is one of the most critical design criteria for modern embedded systems such as multiprocessor system-on-chips (MPSoCs).
Operating Systems • Distributed, Parallel, and Cluster Computing
no code implementations • 23 Jul 2018 • Mahdi Nazemi, Ghasem Pasandi, Massoud Pedram
Deep neural networks have been successfully deployed in a wide variety of applications including computer vision and speech recognition.
no code implementations • 3 Jun 2018 • Mahdi Nazemi, Massoud Pedram
Lop allows researchers and designers to quickly compare quality of their models using various data representations and arithmetic operations in Python and contrast the hardware cost of viable representations by synthesizing them on their target platforms (e.g., FPGA or ASIC).
no code implementations • 11 Jan 2018 • Mahdi Nazemi, Amir Erfan Eshratifar, Massoud Pedram
With ever-increasing application of machine learning models in various domains such as image classification, speech recognition and synthesis, and health care, designing efficient hardware for these models has gained a lot of popularity.
no code implementations • 13 Dec 2017 • Sheng Lin, Ning Liu, Mahdi Nazemi, Hongjia Li, Caiwen Ding, Yanzhi Wang, Massoud Pedram
The large model size of DNNs, while providing excellent accuracy, also burdens the embedded platforms with intensive computation and storage.
no code implementations • 6 Jul 2017 • Mahdi Nazemi, Shahin Nazarian, Massoud Pedram
Independent Component Analysis (ICA) is a dimensionality reduction technique that can boost efficiency of machine learning models that deal with probability density functions, e.g., Bayesian neural networks.
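ICA's core mechanics can be shown with a bare-bones FastICA (tanh nonlinearity) recovering two independent non-Gaussian sources from a linear mixture; this is a didactic sketch under made-up data, and practical work would use a library implementation such as scikit-learn's `FastICA`:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000
S = rng.uniform(-1, 1, size=(2, n))       # independent (non-Gaussian) sources
A = np.array([[1.0, 0.6], [0.5, 1.0]])    # unknown mixing matrix
X = A @ S                                  # observed mixtures

# Center and whiten the observations.
X = X - X.mean(axis=1, keepdims=True)
d_eig, E = np.linalg.eigh(X @ X.T / n)
Xw = E @ np.diag(d_eig ** -0.5) @ E.T @ X

# Symmetric FastICA fixed-point iterations with g(u) = tanh(u).
W = rng.standard_normal((2, 2))
for _ in range(200):
    WX = W @ Xw
    G, Gp = np.tanh(WX), 1 - np.tanh(WX) ** 2
    W = (G @ Xw.T) / n - np.diag(Gp.mean(axis=1)) @ W
    U, _, Vt = np.linalg.svd(W)
    W = U @ Vt                             # decorrelate: W <- (W W^T)^(-1/2) W

Y = W @ Xw                                 # recovered sources (up to sign/order)
```

Each row of `Y` should correlate strongly with one of the original sources, up to the sign and permutation ambiguity inherent to ICA.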