no code implementations • 1 Oct 2021 • Guin Gilman, Robert J. Walls
We investigate the performance of the concurrency mechanisms available on NVIDIA's new Ampere GPU microarchitecture under deep learning training and inference workloads.
no code implementations • 30 Apr 2021 • Jean-Baptiste Truong, William Gallagher, Tian Guo, Robert J. Walls
This study identifies two key bottlenecks to executing deep neural networks in trusted execution environments (TEEs), namely page thrashing during the execution of convolutional layers and the decryption of large weight matrices in fully-connected layers, and proposes techniques to alleviate them.
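The two bottlenecks follow from simple memory arithmetic. A back-of-the-envelope sketch (not the paper's method) shows why: a convolutional layer's working set can exceed a TEE enclave's usable memory, forcing page thrashing, while fully-connected layers are dominated by the cost of decrypting very large weight matrices. The 93 MiB figure below is the approximate usable Enclave Page Cache of first-generation Intel SGX; the layer shapes are illustrative VGG-16-style dimensions, not taken from the paper.

```python
EPC_USABLE_BYTES = 93 * 2**20  # approx. usable SGX v1 enclave memory (assumption)

def conv_working_set(h, w, c_in, c_out, k, batch=1, dtype_bytes=4):
    """Rough working set of a conv layer: input + output activations + weights."""
    inputs = batch * h * w * c_in * dtype_bytes
    outputs = batch * h * w * c_out * dtype_bytes  # assumes 'same' padding
    weights = k * k * c_in * c_out * dtype_bytes
    return inputs + outputs + weights

def fc_weight_bytes(n_in, n_out, dtype_bytes=4):
    """Size of the weight matrix that must be decrypted for an FC layer."""
    return n_in * n_out * dtype_bytes

# An early VGG-16-style conv layer on 224x224 inputs, batch of 32:
conv_bytes = conv_working_set(224, 224, 64, 64, 3, batch=32)
print(conv_bytes > EPC_USABLE_BYTES)  # True: working set far exceeds the enclave

# VGG-16's first FC layer (25088 -> 4096) holds ~392 MiB of encrypted weights.
fc_bytes = fc_weight_bytes(25088, 4096)
print(fc_bytes / 2**20)  # 392.0
```

Either quantity alone dwarfs the enclave, which is why conv layers thrash pages and FC layers pay heavily for decryption.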
2 code implementations • CVPR 2021 • Jean-Baptiste Truong, Pratyush Maini, Robert J. Walls, Nicolas Papernot
Current model extraction attacks assume that the adversary has access to a surrogate dataset with characteristics similar to the proprietary data used to train the victim model.
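The surrogate-dataset assumption can be made concrete with a toy illustration (not the paper's attack): the adversary queries the black-box victim on data it already holds and trains a clone on the returned labels. Every model and dataset here is a made-up stand-in.

```python
import random

random.seed(0)

def victim(x):
    """Black-box victim model: a secret linear decision rule."""
    return 1 if 2.0 * x[0] - 1.0 * x[1] > 0.5 else 0

# Adversary's surrogate dataset, assumed to resemble the victim's training data.
surrogate_data = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(500)]

# Step 1: label the surrogate data by querying the victim.
labeled = [(x, victim(x)) for x in surrogate_data]

# Step 2: train a clone (a simple perceptron) on the stolen labels.
w, b = [0.0, 0.0], 0.0
for _ in range(50):
    for x, y in labeled:
        pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
        err = y - pred
        w[0] += 0.1 * err * x[0]
        w[1] += 0.1 * err * x[1]
        b += 0.1 * err

clone = lambda x: 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
agreement = sum(clone(x) == victim(x) for x, _ in labeled) / len(labeled)
print(agreement)  # high agreement between clone and victim
```

The paper's contribution is precisely to relax the assumption baked into `surrogate_data` above: that such in-distribution data is available to the attacker at all.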
1 code implementation • 7 Apr 2020 • Shijian Li, Robert J. Walls, Tian Guo
However, it is challenging to determine the appropriate cluster configuration (e.g., server type and number) for different training workloads while balancing the trade-offs in training time, cost, and model accuracy.
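The time/cost tension can be sketched with a deliberately simple model (an illustration, not the paper's approach): training time scales as work divided by aggregate throughput, but scaling efficiency decays as servers are added, so larger clusters finish faster yet cost more in total. All constants below (throughput, price, efficiency decay) are made up.

```python
def training_time_hours(n_servers, work=1000.0, throughput=10.0):
    """Time under an assumed diminishing-returns scaling efficiency."""
    efficiency = 1.0 / (1.0 + 0.05 * (n_servers - 1))
    return work / (n_servers * throughput * efficiency)

def cost_dollars(n_servers, price_per_hour=3.0):
    """Total cluster cost: servers x hourly price x elapsed time."""
    return n_servers * price_per_hour * training_time_hours(n_servers)

# Time falls as the cluster grows, but total cost rises: the core trade-off.
for n in (1, 2, 4, 8, 16):
    print(n, round(training_time_hours(n), 2), round(cost_dollars(n), 2))
```

A 16-server cluster trains roughly 9x faster than one server in this toy model but costs about 75% more, which is why the right configuration depends on the workload and the user's objective.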
no code implementations • 28 Aug 2019 • Peter M. VanNostrand, Ioannis Kyriazis, Michelle Cheng, Tian Guo, Robert J. Walls
Performing deep learning on end-user devices provides fast offline inference results and can help protect the user's privacy.
no code implementations • 28 Feb 2019 • Shijian Li, Robert J. Walls, Lijie Xu, Tian Guo
Distributed training frameworks, like TensorFlow, have been proposed as a means to reduce the training time of deep learning models by using a cluster of GPU servers.
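The data-parallel pattern these frameworks implement can be sketched in a few lines: each worker computes a gradient on its own data shard, the gradients are averaged (the all-reduce step), and every replica applies the identical update. This single-process stand-in uses no real TensorFlow API; it only mimics the structure.

```python
def gradient(w, shard):
    """Gradient of mean squared error for the 1-D model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

# Four "workers", each holding a shard of data generated by y = 3x.
shards = [[(x, 3.0 * x) for x in range(i, i + 3)] for i in range(1, 9, 2)]

w = 0.0
for step in range(200):
    grads = [gradient(w, s) for s in shards]  # computed in parallel in practice
    avg = sum(grads) / len(grads)             # all-reduce: average across workers
    w -= 0.01 * avg                           # identical update on every replica

print(round(w, 3))  # converges toward the true slope 3.0
```

Because every replica applies the same averaged gradient, all copies of `w` stay synchronized; the communication cost of that averaging step is a central factor in the cluster-sizing questions the paper studies.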