Summary
Neural networks can be fooled not just by spreading small changes across all inputs, but by concentrating larger modifications on a carefully selected few. While gradient-based attacks like FGSM and DeepFool focus on minimizing perturbation magnitude under various norms, they allow all features to change. In many real-world scenarios, the constraint that matters most is not how much each feature changes, but how many features can be modified at all. Sparse attacks address this by changing as few input dimensions as possible while maintaining attack effectiveness.
The sparsity budget is measured by the L0 pseudo-norm, which counts the number of coordinates that differ between adversarial and original inputs. This module provides a comprehensive exploration of techniques that generate adversarial examples under strict sparsity constraints:
- Mathematical foundations of sparsity-constrained optimization, including L0 budgets, L1-induced sparsity, and saliency-based feature selection.
- ElasticNet Attack (EAD), which combines L1 and L2 regularization to produce perturbations that are simultaneously sparse (few modified pixels) and small in magnitude (bounded individual changes).
- FISTA optimization for solving the non-smooth ElasticNet objective with proximal gradient descent and momentum acceleration.
- Jacobian-based Saliency Map Attack (JSMA), which enforces explicit L0 budgets by iteratively modifying one or two features per step based on gradient-derived saliency scores.
- Single-pixel and pairwise JSMA variants that balance attack efficiency with modification counts.
This module is broken into sections with hands-on exercises for each attack method. It concludes with a practical skills assessment to validate your understanding.
You can start and stop at any time and resume where you left off. There is no time limit or grading, but you must complete all exercises and the skills assessment to receive the maximum cubes and have the module marked as complete in any selected paths.
To ensure a smooth learning experience, the following skills are mandatory: solid Python proficiency, familiarity with Jupyter Notebooks, and an understanding of neural networks, gradient computation, and optimization methods.
A firm grasp of these modules is recommended before starting:
- Fundamentals of AI
- Applications of AI in InfoSec
- Introduction to Red Teaming AI
- Prompt Injection Attacks
- AI Data Attacks
- AI Evasion Foundations
- AI Evasion - First-Order Attacks
It is HIGHLY recommended to use your own PC/Laptop for the practicals.
Introduction to Sparsity Evasion Attacks
Sparsity attacks seek misclassification by changing as few input dimensions as possible. The sparsity budget is measured by the L0 pseudo-norm, defined as the number of coordinates that differ between an adversarial input and the original: for a perturbation δ = x_adv − x, ||δ||_0 = |{ i : δ_i ≠ 0 }|.
Instead of spreading small changes over many features, these attacks concentrate edits on a small set of high‑impact features. This section extends the first‑order perspective to settings where the primary constraint is how many features may change, not how small each change must be.
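As a quick illustration, the minimal sketch below (assuming NumPy and a toy 28×28 grayscale image; the arrays and edit locations are placeholders) counts how many coordinates a perturbation touches and contrasts that with the L2 and L∞ measures from the previous module.

```python
import numpy as np

# Hypothetical original image and adversarial copy, both in [0, 1].
x = np.random.rand(28, 28).astype(np.float32)
x_adv = x.copy()
x_adv[3, 7] += 0.45    # concentrate a few larger edits...
x_adv[12, 20] -= 0.30  # ...on a handful of pixels
x_adv = np.clip(x_adv, 0.0, 1.0)

delta = x_adv - x
l0 = np.count_nonzero(delta)           # L0: how many pixels changed at all
l2 = np.linalg.norm(delta.ravel(), 2)  # L2: overall energy of the change
linf = np.abs(delta).max()             # L-inf: largest single-pixel change

print(f"L0 = {l0}, L2 = {l2:.3f}, Linf = {linf:.3f}")
```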
From First‑Order to Sparsity
The previous module used gradients to move an input across a decision boundary under L2 or L∞ limits. Those norms penalize the size of a perturbation but allow all features to move. In many systems, the attack surface is discrete or partially discrete, for example pixels that can saturate to bounds or tokens that change one at a time, so controlling the number of edited features is the relevant constraint. Sparsity attacks keep the modified-feature count small, which preserves most of the input unchanged and can evade simple anomaly detectors that focus on global noise levels.
Threat Model and Budgets
We consider inference-time attackers who can compute or approximate gradients. In a white-box setting the attacker evaluates derivatives through the model and uses them to select which features to edit. In a black-box setting the attacker estimates importance scores by queries, or transfers sparse patterns from a surrogate. The primary budget is L0, sometimes with auxiliary limits on L∞ or L2 to keep edits bounded and valid. Inputs remain in [0,1] for images after each update, and if the model uses normalization, gradients propagate through it by the chain rule, so reasoning in pixel space remains correct while respecting box constraints.
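The sketch below illustrates this box-constraint handling, assuming a PyTorch classifier with torchvision-style normalization; the model, statistics, and label are stand-ins. Edits are made in pixel space, the gradient flows through the normalization automatically, and every update is projected back into [0, 1].

```python
import torch
import torch.nn as nn

# Placeholder normalization statistics and model; swap in your own.
mean = torch.tensor([0.4914, 0.4822, 0.4465]).view(1, 3, 1, 1)
std = torch.tensor([0.2470, 0.2435, 0.2616]).view(1, 3, 1, 1)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # stand-in classifier

x = torch.rand(1, 3, 32, 32)  # pixel-space input in [0, 1]
y = torch.tensor([3])         # true label

x_adv = x.clone().requires_grad_(True)
logits = model((x_adv - mean) / std)   # normalization inside the forward pass
loss = nn.functional.cross_entropy(logits, y)
loss.backward()                        # chain rule handles the normalization

grad = x_adv.grad                      # pixel-space gradient: use it to rank features

# Example sparse edit: saturate the single pixel with the largest gradient magnitude.
with torch.no_grad():
    idx = grad.abs().view(-1).argmax()
    flat = x_adv.detach().clone().view(-1)
    flat[idx] = 1.0 if grad.view(-1)[idx] > 0 else 0.0
    x_adv = flat.view_as(x).clamp(0.0, 1.0)  # project back into the valid box
```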
Two Paths to Sparse Perturbations
ElasticNet (EAD) promotes sparsity by adding an L1 penalty to the optimization. The L1 term encourages many coordinates of the perturbation to be exactly zero, which approximates an L0 goal while remaining continuous. A common objective is

min_δ  c · f(x + δ) + ||δ||_2^2 + β ||δ||_1,   subject to x + δ ∈ [0, 1]^n,

where f is a loss that enforces misclassification (often targeted), c balances attack success with compactness, and β controls sparsity through the L1 term. The result is a small set of larger edits rather than many tiny ones.
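To make the L1 mechanics concrete, here is a minimal sketch of a single proximal (ISTA-style) step on that objective, the building block that FISTA accelerates with momentum. The model, loss function, and constants are illustrative placeholders; the full FISTA loop is covered in the hands-on section.

```python
import torch

def soft_threshold(z, thresh):
    # Proximal operator of thresh * ||z||_1: shrinks small coordinates to exactly zero.
    return torch.sign(z) * torch.clamp(torch.abs(z) - thresh, min=0.0)

# One gradient-plus-proximal step on  c * f(x + delta) + ||delta||_2^2 + beta * ||delta||_1.
def ead_step(model, loss_fn, x, y_target, delta, lr=0.01, c=1.0, beta=1e-2):
    delta = delta.clone().requires_grad_(True)
    smooth = c * loss_fn(model(x + delta), y_target) + delta.pow(2).sum()
    smooth.backward()
    with torch.no_grad():
        z = delta - lr * delta.grad                   # gradient step on the smooth part
        new_delta = soft_threshold(z, lr * beta)      # L1 proximal step -> sparsity
        new_delta = (x + new_delta).clamp(0.0, 1.0) - x  # keep x + delta in [0, 1]
    return new_delta
```

Applied repeatedly, the soft-thresholding step drives most coordinates of the perturbation to exactly zero while the few surviving coordinates absorb larger edits, which is the behavior the objective above is designed to produce.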
Jacobian-based Saliency Map Attack (JSMA) enforces an explicit L0 budget by modifying one or two features per iteration using a saliency map derived from the input Jacobian to score candidates that raise the target class while suppressing competitors.
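A hedged sketch of that saliency computation: assuming a small PyTorch classifier (a placeholder here), it forms the Jacobian rows for the target logit and for the sum of the other logits, scores each pixel with the usual "increase target and decrease competitors" rule, and saturates the single best candidate.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # stand-in classifier
x = torch.rand(1, 1, 28, 28, requires_grad=True)             # pixel-space input
target = 7                                                   # desired class

logits = model(x)

# Jacobian rows: d(target logit)/dx and d(sum of other logits)/dx.
grad_target = torch.autograd.grad(logits[0, target], x, retain_graph=True)[0]
grad_others = torch.autograd.grad(logits[0].sum() - logits[0, target], x)[0]

# A pixel is useful only if it raises the target AND lowers the competitors.
mask = (grad_target > 0) & (grad_others < 0)
saliency = torch.where(mask, grad_target * grad_others.abs(),
                       torch.zeros_like(grad_target))

# Pick the single most salient pixel and push it toward its upper bound.
idx = saliency.view(-1).argmax()
with torch.no_grad():
    x_adv = x.clone().view(-1)
    x_adv[idx] = torch.clamp(x_adv[idx] + 1.0, 0.0, 1.0)  # saturate the chosen pixel
    x_adv = x_adv.view_as(x)
```

The pairwise variant covered later typically scores pairs of pixels with the same rule, which trades extra search cost for fewer wasted modifications.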
Why Sparsity Attacks Matter
Sparse edits align with real constraints. An attacker may only be able to flip a few bits in a binary, touch a handful of pixels due to rendering limits, or change a small number of tokens in text. Sparse perturbations can be harder to detect with defenses tuned to global noise statistics, and they reveal which features the model treats as most decisive. For defenders, reproducing EAD and JSMA establishes baselines for L1-induced sparsity and explicit L0 control, which together expose different failure modes than L2 or L∞ attacks.