Efficient Regularization for Adversarially Robust Deep ReLU Networks

29 Sep 2021 · Charles Jin, Martin Rinard

We present a regularization functional for deep neural networks with ReLU activations and propose regularizers that encourage networks that are smooth not only in their predictions but also in their decision boundaries. We evaluate the stability of our networks against the standard set of $\ell_2$ and $\ell_\infty$ norm-bounded adversaries, as well as several recently proposed perception-based adversaries, including spatial, recoloring, JPEG, and a learned neural threat model. Crucially, our models are simultaneously robust against multiple state-of-the-art adversaries, suggesting that the robustness generalizes well to \textit{unseen} adversaries. Furthermore, our techniques do not rely on adversarial training and are thus very efficient, incurring overhead on par with two additional parallel passes through the network. On CIFAR-10, we obtain our results after training for only 4 hours, while the next-best-performing baseline requires nearly 25 hours of training. To the best of our knowledge, this work presents the first technique to achieve robustness against adversarial perturbations \textit{without} adversarial training.
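No implementation has been released, so the PyTorch sketch below is only a rough illustration of the kind of training step the abstract describes: two extra perturbed forward passes per batch (the "two additional parallel passes"), a penalty on the divergence of the resulting outputs (prediction smoothness), and a soft penalty on disagreement between ReLU activation patterns as a stand-in for decision-boundary smoothness. Every name here (`MLP`, `smoothness_losses`, `eps`, `lam_pred`, `lam_pat`) and the specific loss forms are illustrative assumptions, not the paper's actual regularization functional.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MLP(nn.Module):
    """Small ReLU network that also returns its pre-activations,
    so a regularizer can compare activation patterns across inputs."""
    def __init__(self, in_dim=3072, hidden=512, classes=10):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, hidden)
        self.out = nn.Linear(hidden, classes)

    def forward(self, x):
        z1 = self.fc1(x)
        z2 = self.fc2(F.relu(z1))
        logits = self.out(F.relu(z2))
        return logits, (z1, z2)  # logits plus pre-activations

def smoothness_losses(model, x, eps=8 / 255):
    """Two perturbed forward passes: penalize (a) divergence of the
    outputs and (b) a soft Hamming-style disagreement between the
    ReLU on/off patterns of the two perturbed copies."""
    delta1 = torch.empty_like(x).uniform_(-eps, eps)
    delta2 = torch.empty_like(x).uniform_(-eps, eps)
    logits1, pre1 = model(x + delta1)
    logits2, pre2 = model(x + delta2)
    pred_smooth = F.mse_loss(logits1, logits2)
    # sigmoid(z) is a differentiable surrogate for the 0/1
    # "is this ReLU unit active" indicator
    pattern = sum(F.mse_loss(torch.sigmoid(a), torch.sigmoid(b))
                  for a, b in zip(pre1, pre2))
    return pred_smooth, pattern

def training_step(model, opt, x, y, lam_pred=1.0, lam_pat=1.0):
    """Standard cross-entropy on clean inputs plus the two
    smoothness regularizers; lam_* weights are placeholders."""
    opt.zero_grad()
    logits, _ = model(x)
    loss = F.cross_entropy(logits, y)
    pred_smooth, pattern = smoothness_losses(model, x)
    (loss + lam_pred * pred_smooth + lam_pat * pattern).backward()
    opt.step()
    return loss.item()
```

Matching the sigmoid-squashed pre-activations across the two perturbed passes discourages small perturbations from flipping any ReLU unit, i.e. from crossing a linear-region boundary, which is one plausible reading of "smooth decision boundaries"; the paper's actual regularizers and hyperparameters may differ.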
