no code implementations • 19 Mar 2024 • Haocheng Xi, Yuxiang Chen, Kang Zhao, Kaijun Zheng, Jianfei Chen, Jun Zhu
Moreover, for a standard transformer block, our method offers an end-to-end training speedup of 1.42x and a 1.49x memory reduction compared to the FP16 baseline.
1 code implementation • 26 Jan 2024 • Yifeng Liu, Hanwen Xu, Tangqi Fang, Haocheng Xi, Zixuan Liu, Sheng Zhang, Hoifung Poon, Sheng Wang
As a fundamental task in computational chemistry, retrosynthesis prediction aims to identify a set of reactants to synthesize a target molecule.
1 code implementation • NeurIPS 2023 • Haocheng Xi, Changhao Li, Jianfei Chen, Jun Zhu
To achieve this, we carefully analyze the specific structures of activation and gradients in transformers to propose dedicated quantizers for them.
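As an illustration only (not the paper's specific quantizers), the general idea of a low-bit quantizer tailored to a tensor's structure can be sketched as a symmetric per-token INT4 quantizer, where each token row gets its own scale so outlier tokens do not destroy the precision of the rest:

```python
import numpy as np

def quantize_int4_per_token(x, eps=1e-8):
    """Symmetric per-token INT4 quantization (illustrative sketch only).

    x: (tokens, features) float array.
    Returns integer values in [-7, 7] plus a per-token scale.
    """
    # One scale per token row, mapping the row's max magnitude to 7.
    scale = np.max(np.abs(x), axis=1, keepdims=True) / 7.0 + eps
    q = np.clip(np.round(x / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximate float tensor from the INT4 codes.
    return q.astype(np.float32) * scale

x = np.random.randn(4, 16).astype(np.float32)
q, s = quantize_int4_per_token(x)
x_hat = dequantize(q, s)
```

The actual method in the paper uses more specialized quantizers matched to the distinct statistics of activations and gradients; this sketch only shows the basic quantize/dequantize round trip.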