Search Results for author: Andrei Tomut

Found 1 paper, 0 papers with code

CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks

no code implementations • 25 Jan 2024 • Andrei Tomut, Saeed S. Jahromi, Abhijoy Sarkar, Uygar Kurt, Sukhbinder Singh, Faysal Ishtiaq, Cesar Muñoz, Prabdeep Singh Bajaj, Ali Elborady, Gianni Del Bimbo, Mehrazin Alizadeh, David Montero, Pablo Martin-Ramiro, Muhammad Ibrahim, Oussama Tahiri Alaoui, John Malcolm, Samuel Mugel, Roman Orus

Traditional compression methods such as pruning, distillation, and low-rank approximation focus on reducing the effective number of neurons in the network, while quantization focuses on reducing the numerical precision of individual weights to reduce the model size while keeping the number of neurons fixed.
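The contrast drawn in this abstract can be illustrated with a minimal sketch. It is not the CompactifAI method and not code from the paper: the toy weight matrix and the helper names `prune_neurons`, `quantize_int8`, and `dequantize` are hypothetical, chosen only to show that pruning removes neurons (rows) while quantization keeps every neuron and stores each weight at lower precision.

```python
# Toy weight matrix: 4 neurons (rows) with 3 inputs each. Values are made up.
weights = [
    [0.50, -1.00, 0.25],
    [0.01, 0.02, -0.01],
    [-0.75, 0.30, 1.00],
    [0.00, 0.03, 0.02],
]


def prune_neurons(w, keep):
    """Structured pruning: keep the `keep` rows with the largest L1 norm."""
    ranked = sorted(
        range(len(w)),
        key=lambda i: sum(abs(x) for x in w[i]),
        reverse=True,
    )
    kept = sorted(ranked[:keep])  # preserve the original row order
    return [w[i] for i in kept]


def quantize_int8(w):
    """Symmetric uniform quantization to integers in [-127, 127] with one scale."""
    max_abs = max(abs(x) for row in w for x in row)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [[round(x / scale) for x in row] for row in w]
    return q, scale


def dequantize(q, scale):
    """Map the stored integers back to approximate floating-point weights."""
    return [[v * scale for v in row] for row in q]


# Pruning changes the shape: fewer neurons remain.
pruned = prune_neurons(weights, keep=2)

# Quantization keeps the shape: same neurons, coarser values.
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

print(len(weights), "neurons before pruning,", len(pruned), "after")
print(len(q), "neurons after quantization, scale =", scale)
```

In this sketch the pruned matrix has fewer rows than the original, whereas the quantized matrix has the same shape and differs only in how precisely each entry is stored.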

Model Compression • Quantization • +1

