Fast Minimum-norm Adversarial Attacks through Adaptive Norm Constraints
Roli F.;
2021-01-01
Abstract
Evaluating adversarial robustness amounts to finding the minimum perturbation needed to have an input sample misclassified. The inherent complexity of the underlying optimization requires current gradient-based attacks to be carefully tuned, initialized, and possibly executed for many computationally-demanding iterations, even when specialized to a given perturbation model. In this work, we overcome these limitations by proposing a fast minimum-norm (FMN) attack that works with different ℓp-norm perturbation models (p = 0, 1, 2, ∞), is robust to hyperparameter choices, does not require adversarial starting points, and converges within a few lightweight steps. It works by iteratively finding the sample misclassified with maximum confidence within an ℓp-norm constraint of size ε, while adapting ε to minimize the distance of the current sample to the decision boundary. Extensive experiments show that FMN significantly outperforms existing ℓ0-, ℓ1-, and ℓ∞-norm attacks in terms of perturbation size, convergence speed, and computation time, while reporting comparable performance against state-of-the-art ℓ2-norm attacks.
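The core loop described in the abstract (maximize misclassification confidence within an ε-ball, then adapt ε toward the decision boundary) can be sketched for the ℓ2 case as below. This is a minimal illustrative sketch, not the authors' implementation: the function names (`fmn_l2`, `predict`, `grad_loss`), the fixed shrink/grow factor `gamma`, and the linearly decaying step size are assumptions made for the example.

```python
import numpy as np

def fmn_l2(x, y, predict, grad_loss, steps=100, alpha=1.0, gamma=0.05):
    """Illustrative sketch of the FMN idea for the l2 norm.

    Each iteration: (1) check whether the current point is adversarial and
    tighten or relax the norm constraint eps accordingly; (2) take a gradient
    step that increases misclassification confidence; (3) project the
    perturbation delta back onto the eps-ball.
    """
    delta = np.zeros_like(x)
    eps = np.inf            # unconstrained until a first adversarial point is found
    best = None             # smallest adversarial perturbation seen so far
    for t in range(steps):
        if predict(x + delta) != y:                  # adversarial: tighten eps
            d = np.linalg.norm(delta)
            eps = min(eps, d) * (1.0 - gamma)
            if best is None or d < np.linalg.norm(best):
                best = delta.copy()
        elif np.isfinite(eps):                       # not adversarial: relax eps
            eps *= 1.0 + gamma
        g = grad_loss(x + delta, y)                  # ascend the misclassification loss
        step = alpha * (1.0 - t / steps)             # linearly decaying step size
        delta = delta + step * g / (np.linalg.norm(g) + 1e-12)
        n = np.linalg.norm(delta)
        if np.isfinite(eps) and n > eps:             # project onto the eps-ball
            delta = delta * (eps / n)
    return best if best is not None else delta
```

On a toy linear binary classifier, e.g. `predict = lambda z: int(w @ z > 0)` with `w = [1, 0]` and a clean point `x = [2, 0]` of class 1, the sketch drives ‖δ‖ down toward the true minimum ℓ2 distance of 2 to the decision boundary, oscillating ε around it from above.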