no code implementations • 23 May 2024 • Zi Yang, Samridhi Choudhary, Xinfeng Xie, Cao Gao, Siegfried Kunzmann, Zheng Zhang
CoMERA achieves end-to-end rank-adaptive tensor-compressed training via a multi-objective optimization formulation, and improves the training to provide both a high compression ratio and excellent accuracy in the training process.
no code implementations • 7 Dec 2021 • Xinfeng Xie, Prakash Prabhu, Ulysse Beaugnon, Phitchaya Mangpo Phothilimthana, Sudip Roy, Azalia Mirhoseini, Eugene Brevdo, James Laudon, Yanqi Zhou
Partitioning ML graphs for MCMs is particularly hard as the search space grows exponentially with the number of chiplets available and the number of nodes in the neural network.
no code implementations • 11 Mar 2021 • Xinfeng Xie, Peng Gu, Yufei Ding, Dimin Niu, Hongzhong Zheng, Yuan Xie
For general purpose scenarios, lightweight hardware designs for diverse data paths, architectural supports for the SIMT programming model, and end-to-end software optimizations remain challenging.
Hardware Architecture
no code implementations • 26 Sep 2020 • Xiaobing Chen, yuke wang, Xinfeng Xie, Xing Hu, Abanti Basak, Ling Liang, Mingyu Yan, Lei Deng, Yufei Ding, Zidong Du, Yunji Chen, Yuan Xie
Graph convolutional network (GCN) emerges as a promising direction to learn the inductive representation in graph data commonly used in widespread applications, such as E-commerce, social networks, and knowledge graphs.
Hardware Architecture
no code implementations • 25 Sep 2019 • Peiqi Wang, Yu Ji, Xinfeng Xie, Yongqiang Lyu, Dongsheng Wang, Yuan Xie
Despite the success in model reduction of convolutional neural networks (CNNs), neural network quantization methods have not yet been studied on GANs, which are mainly faced with the issues of both the effectiveness of quantization algorithms and the instability of training GAN models.
no code implementations • 10 Mar 2019 • Xing Hu, Ling Liang, Lei Deng, Shuangchen Li, Xinfeng Xie, Yu Ji, Yufei Ding, Chang Liu, Timothy Sherwood, Yuan Xie
As neural networks continue their reach into nearly every aspect of software operations, the details of those networks become an increasingly sensitive subject.
Cryptography and Security Hardware Architecture
no code implementations • 28 Jan 2019 • Yu Ji, Youyang Zhang, Xinfeng Xie, Shuangchen Li, Peiqi Wang, Xing Hu, Youhui Zhang, Yuan Xie
In this paper, we propose a full system stack solution, composed of a reconfigurable architecture design, Field Programmable Synapse Array (FPSA) and its software system including neural synthesizer, temporal-to-spatial mapper, and placement & routing.
no code implementations • 24 Jan 2019 • Peiqi Wang, Dongsheng Wang, Yu Ji, Xinfeng Xie, Haoxuan Song, XuXin Liu, Yongqiang Lyu, Yuan Xie
The intensive computation and memory requirements of generative adversarial neural networks (GANs) hinder its real-world deployment on edge devices such as smartphones.
no code implementations • NeurIPS 2018 • Peiqi Wang, Xinfeng Xie, Lei Deng, Guoqi Li, Dongsheng Wang, Yuan Xie
For example, we improve the perplexity per word (PPW) of a ternary LSTM on Penn Tree Bank (PTB) corpus from 126 (the state-of-the-art result to the best of our knowledge) to 110. 3 with a full precision model in 97. 2, and a ternary GRU from 142 to 113. 5 with a full precision model in 102. 7.