Search Results for author: Chakshu Moar

Found 1 paper, 0 papers with code

Characterizing the Accuracy - Efficiency Trade-off of Low-rank Decomposition in Language Models

no code implementations • 10 May 2024 • Chakshu Moar, Michael Pellauer, Hyoukjun Kwon

The results show that low-rank decomposition can be a promising direction for LLM-based applications that require real-time service at scale (e.g., AI agent assist and real-time coding assistants), where latency is as important as model accuracy.

Model Compression • Navigate
