no code implementations • 12 Mar 2024 • Zhanpeng Zeng, Karthikeyan Sankaralingam, Vikas Singh
A popular strategy is the use of low bit-width integers to approximate the original entries in a matrix.
1 code implementation • 12 Mar 2024 • Zhanpeng Zeng, Michael Davies, Pranav Pulijala, Karthikeyan Sankaralingam, Vikas Singh
While GPU clusters are the de facto choice for training large deep neural network (DNN) models today, several reasons, including ease of workflow, security, and cost, have led to efforts to investigate whether CPUs may be viable for routine inference in many sectors of the industry.
no code implementations • 10 Mar 2024 • Harshavardhan Adepu, Zhanpeng Zeng, Li Zhang, Vikas Singh
If quantization is interpreted as the addition of noise, our casting of the problem allows invoking an extensive body of known consistent recovery and noise robustness guarantees.
1 code implementation • 21 Jul 2022 • Zhanpeng Zeng, Sourav Pal, Jeffery Kline, Glenn M Fung, Vikas Singh
Transformers have emerged as a preferred model for many tasks in natural language processing and vision.
1 code implementation • 18 Nov 2021 • Zhanpeng Zeng, Yunyang Xiong, Sathya N. Ravi, Shailesh Acharya, Glenn Fung, Vikas Singh
In this paper, we show that a Bernoulli sampling attention mechanism based on Locality Sensitive Hashing (LSH) decreases the quadratic complexity of such models to linear.
6 code implementations • 7 Feb 2021 • Yunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty, Mingxing Tan, Glenn Fung, Yin Li, Vikas Singh
The scalability of Nyströmformer enables application to longer sequences with thousands of tokens.
Ranked #13 on Semantic Textual Similarity on MRPC (F1 metric)