1 code implementation • 9 May 2024 • Zifan He, Zongyue Qin, Neha Prakriya, Yizhou Sun, Jason Cong
With an additional 0.5% to 2% of parameters, HMT can easily plug in and augment future LLMs to handle long context effectively.
no code implementations • 24 Jun 2023 • Shichang Zhang, Atefeh Sohrabizadeh, Cheng Wan, Zijie Huang, Ziniu Hu, Yewen Wang, Yingyan Lin, Jason Cong, Yizhou Sun
Graph neural networks (GNNs) are emerging for machine learning research on graph-structured data.
no code implementations • 18 May 2023 • Yunsheng Bai, Atefeh Sohrabizadeh, Zongyue Qin, Ziniu Hu, Yizhou Sun, Jason Cong
In addition, these programs can be compiled and converted into a control data flow graph (CDFG), and the compiler also provides fine-grained alignment between the code tokens and the CDFG nodes.
no code implementations • 9 Dec 2022 • Zhe Chen, Garrett J. Blair, Chengdi Cao, Jim Zhou, Daniel Aharoni, Peyman Golshani, Hugh T. Blair, Jason Cong
Our FPGA implementation enables real-time calcium image decoding with sub-ms processing latency for closed-loop feedback applications.
no code implementations • 17 Nov 2021 • Atefeh Sohrabizadeh, Yunsheng Bai, Yizhou Sun, Jason Cong
High-level synthesis (HLS) has freed computer architects from developing their designs in a very low-level language and from having to specify exactly how data should be transferred at the register level.
no code implementations • 10 Nov 2021 • Atefeh Sohrabizadeh, Yuze Chi, Jason Cong
While there have been many studies on hardware acceleration for deep learning on images, there has been a rather limited focus on accelerating deep learning applications involving graphs.
1 code implementation • 8 Oct 2021 • Linghao Song, Yuze Chi, Jason Cong
In this work, we present PYXIS, a performance dataset for specialized accelerators on sparse data.
1 code implementation • 12 Oct 2020 • Young-kyu Choi, Yuze Chi, Jie Wang, Licheng Guo, Jason Cong
With the recent release of High Bandwidth Memory (HBM) based FPGA boards, developers can now exploit unprecedented external memory bandwidth.
Hardware Architecture
1 code implementation • IEEE Transactions on Medical Imaging 2020 • Meng Li, William Hsu, Xiaodong Xie, Jason Cong, Wen Gao
We combine these two methods and demonstrate their effectiveness on both CNN-based neural networks and WGAN-based neural networks with comprehensive experiments.
5 code implementations • 22 Feb 2020 • Bochen Tan, Jason Cong
In this paper, we construct QUEKO benchmarks for this problem, which have known optimal depth.
Quantum Physics • Hardware Architecture
no code implementations • 30 Jul 2018 • Jason Cong, Peng Wei, Cody Hao Yu, Peng Zhang
Such a well-defined template is able to support efficient accelerator designs for a broad class of computation kernels, and more importantly, drastically reduce the design space.
Distributed, Parallel, and Cluster Computing • Hardware Architecture
no code implementations • 18 Jul 2018 • Meng Li, Shiwen Shen, Wen Gao, William Hsu, Jason Cong
Computed tomography (CT) is increasingly being used for cancer screening, such as early detection of lung cancer.