no code implementations • 22 Mar 2017 • Xushen Han, Dajiang Zhou, Shihao Wang, Shinji Kimura
Under limited DRAM bandwidth, a system throughput of 1244 GFlop/s is achieved on the Virtex UltraScale platform, which is 5.48 times higher than the state-of-the-art FPGA implementations.
no code implementations • 4 Mar 2017 • Shihao Wang, Dajiang Zhou, Xushen Han, Takeshi Yoshimura
This achieves a peak throughput of 806.4 GOPS at 567.5 mW and is able to accelerate the five convolutional layers in AlexNet at a frame rate of 326.2 fps.