1 code implementation • NeurIPS 2021 • R David Evans, Tor Aamodt
Parallel hardware devices (e.g., graphics processor units) have limited high-bandwidth memory capacity. This negatively impacts the training of deep neural networks (DNNs) by increasing runtime and/or decreasing accuracy when reducing model and/or batch size to fit this capacity.