Your model is secure. Your infrastructure isn't. Cloud AI services, Kubernetes workloads, GPU nodes, CI/CD pipelines, model serving endpoints, training data stores, and cloud metadata services — weaponised for authorised red team engagements.
```shell
pip install red-specter-architect
```
Every AI security tool on the market tests the model. Prompt injection. Jailbreaks. Alignment failures. Nobody tests the infrastructure the model runs on. SageMaker notebooks with public internet access. Kubernetes pods running privileged with host PID sharing. GPU memory that retains previous tenant data. CI/CD pipelines with unsigned model artifacts. Cloud metadata endpoints handing out IAM credentials to any container that asks. ARCHITECT fills that gap.
SageMaker execution roles with s3:* and iam:* permissions. Vertex AI prediction endpoints with no authentication. Azure ML workspaces exposed to the public internet. Model artifact buckets with public ACLs. Every major cloud provider's AI services ship with insecure defaults that nobody audits.
AI workloads get privileged containers because GPU drivers "need" it. Host PID namespaces shared for profiling. No network policies on ML namespaces. RBAC with wildcard permissions because the data scientist asked for it. Model artifact volumes mounted read-write to inference pods. One container escape owns the entire cluster.
Multi-tenant GPU sharing without MIG isolation. GPU memory from the previous tenant's training run still resident and recoverable. NVIDIA profiling APIs exposed. CUDA IPC handles shared between containers on the same GPU. Your competitor's model weights are sitting in GPU memory from the last job.
Training containers pull :latest tags. Model artifacts deployed without cryptographic signatures. Training data downloaded without hash verification. Secrets visible in pipeline logs. Dependencies unpinned. One poisoned base image and every model trained since is compromised.
Inference endpoints with no authentication. Server headers broadcasting "TorchServe 0.9.0". Model metadata endpoints revealing architecture and parameter counts. CORS wildcard on prediction APIs. Debug endpoints returning stack traces with file paths. Model weights downloadable via /models/download.
AWS IMDSv1 still enabled on AI workloads. GCP metadata tokens accessible from any container. Azure managed identity tokens exposed to every pod. Kubernetes service account tokens auto-mounted. Docker sockets accessible from ML containers. One HTTP GET to 169.254.169.254 and you have production credentials.
ARCHITECT decomposes AI infrastructure into seven distinct attack surfaces. Each subsystem has its own engine, test vectors, analysis functions, and evidence collection. Every finding gets a severity rating, risk score, grade, remediation guidance, and subsystem attribution.
Cloud AI service security testing. IAM misconfiguration detection, endpoint exposure analysis, credential leakage scanning, storage exposure checks, network exposure assessment, and audit logging gap identification. Targets SageMaker, Vertex AI, Azure ML, and Bedrock.
Kubernetes AI workload exploitation. RBAC wildcard detection, container escape path analysis (privileged, hostPID, hostNetwork), GPU device isolation checks, model volume writability testing, namespace isolation verification, network policy auditing, and secret management review.
GPU node security assessment. MIG isolation verification, checks that GPU memory is cleared between jobs, NVIDIA driver version security checks (minimum 535.x), compute quota enforcement, profiling API exposure (nsight, nvprof), and CUDA IPC handle sharing detection.
CI/CD pipeline security for AI builds. Unsigned model artifact detection, mutable base image tag scanning (:latest, :main), secret exposure in pipeline logs, training data hash verification, untrusted trigger source analysis, and dependency pinning checks.
Model serving endpoint exploitation. Unauthenticated inference detection, metadata endpoint exposure, health endpoint information leakage, rate limiting bypass for model extraction, debug endpoint discovery, CORS wildcard assessment, model weight download testing, and verbose error message analysis.
Training data security assessment. Public access detection on S3, GCS, Azure Blob, NFS, and local stores. Encryption-at-rest verification, encryption-in-transit enforcement, access logging audit, and versioning checks for data poisoning recovery.
Cloud metadata endpoint exploitation from AI workloads. AWS IMDSv1/v2, GCP metadata service, Azure IMDS, Kubernetes service account token, and Docker socket testing. Credential harvesting detection for AccessKeyId, SecretAccessKey, OAuth tokens, and JWTs.
10 test vectors across AWS, GCP, Azure, and generic cloud AI services. Six attack categories: IAM misconfiguration, endpoint exposure, credential leakage, storage exposure, network exposure, and logging gaps.
| ID | Vector | Severity | Description |
|---|---|---|---|
| ACLD-001 | SageMaker Notebook Public Access | CRITICAL | Check if SageMaker notebook instances are accessible from the internet via DirectInternetAccess setting |
| ACLD-002 | SageMaker Execution Role Overpermission | CRITICAL | Analyse IAM policies attached to SageMaker execution roles for excessive permissions (s3:*, iam:*) |
| ACLD-003 | Model Artifact S3 Bucket Public | CRITICAL | Check bucket ACL and policy for public access, verify Block Public Access settings on model artifact storage |
| ACLD-004 | Vertex AI Endpoint Authentication | CRITICAL | Attempt unauthenticated requests to Vertex AI prediction endpoints. Check for allUsers/allAuthenticatedUsers |
| ACLD-005 | Azure ML Workspace Network Isolation | HIGH | Check if Azure ML workspace is deployed with private endpoints or publicly accessible |
| ACLD-006 | Training Data Bucket Versioning | MEDIUM | Check if training data storage has versioning enabled for data poisoning recovery |
| ACLD-007 | Model Registry Access Logging | HIGH | Verify audit logging on model registry operations — upload, download, and deletion |
| ACLD-008 | Cloud Credential in Env Vars | CRITICAL | Inspect environment variables of running AI containers for credential patterns (AKIA*, service_account, connection strings) |
| ACLD-009 | Inference Endpoint Rate Limiting | MEDIUM | Send burst requests to inference endpoints, check for rate limiting response to prevent model extraction |
| ACLD-010 | Cross-Account Model Access | HIGH | Check S3 bucket policies and SageMaker endpoint policies for cross-account access to model artifacts |
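The ACLD-008 check above reduces to pattern-matching environment variables. A minimal sketch with an illustrative, deliberately small pattern set — ARCHITECT's real set is larger and is not reproduced here:

```python
import re

# Illustrative credential patterns -- not the tool's full list.
CRED_PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "gcp_service_account": re.compile(r"service_account"),
    "azure_connection_string": re.compile(r"AccountKey="),
}

def scan_env(env: dict[str, str]) -> list[tuple[str, str]]:
    """Return (variable name, pattern name) pairs for suspicious values."""
    hits = []
    for var, value in env.items():
        for name, pattern in CRED_PATTERNS.items():
            if pattern.search(value):
                hits.append((var, name))
    return hits
```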
16 test vectors targeting Kubernetes AI workloads and GPU node security. Pod spec analysis detects privileged containers, host namespace sharing, writable model volumes, root execution, and missing resource limits in real time. The table lists a representative subset.
| ID | Vector | Severity | Description |
|---|---|---|---|
| AKUB-001 | Privileged AI Container | CRITICAL | AI workload container running with privileged flag — enables full container escape to host |
| AKUB-002 | Host PID Namespace Shared | CRITICAL | AI workload shares host PID namespace — can see and signal all host processes |
| AKUB-004 | GPU Device Without Isolation | HIGH | GPU device mounted directly without MIG or MPS isolation — co-tenant data leak risk |
| AKUB-005 | Model Volume Writable | HIGH | Model artifacts volume mounted read-write — inference workloads can modify deployed models |
| AKUB-006 | RBAC Wildcard Permissions | CRITICAL | Service account has wildcard verbs or resources in RBAC role — full cluster access |
| AKUB-007 | No Network Policy on AI Namespace | HIGH | AI workload namespace has no NetworkPolicy — unrestricted pod-to-pod traffic |
| AKUB-010 | Cross-Namespace Model Access | HIGH | AI workload can access resources in other namespaces — isolation breach |
| AGPU-001 | No MIG Isolation on Shared GPU | CRITICAL | Multi-Instance GPU not enabled on shared nodes — co-tenant GPU memory visible |
| AGPU-002 | GPU Memory Not Cleared | CRITICAL | GPU memory retains data from previous tenant — model weights or training data recoverable |
| AGPU-003 | NVIDIA Driver Outdated | HIGH | GPU driver below minimum secure version 535.x — known escape or privilege escalation CVEs |
| AGPU-005 | GPU Profiling API Exposed | HIGH | NVIDIA profiling tools (nsight, nvprof) accessible — can observe other workload behaviour |
| AGPU-006 | CUDA IPC Handles Shared | HIGH | CUDA Inter-Process Communication handles shared between containers on same GPU |
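The AKUB-001 and AKUB-002 checks above are pure pod-spec analysis. A minimal sketch over a Kubernetes pod manifest dict — the `audit_pod_spec` function and its finding strings are illustrative, not the engine's actual output format:

```python
def audit_pod_spec(pod: dict) -> list[str]:
    """Flag AKUB-001/002-style issues in a Kubernetes pod manifest dict."""
    findings = []
    spec = pod.get("spec", {})
    if spec.get("hostPID"):
        findings.append("AKUB-002: host PID namespace shared")
    for container in spec.get("containers", []):
        security = container.get("securityContext", {})
        if security.get("privileged"):
            name = container.get("name", "<unnamed>")
            findings.append(f"AKUB-001: container {name} runs privileged")
    return findings
```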
25 test vectors covering the remaining attack surface: CI/CD pipeline integrity, model serving endpoints, training data stores, and cloud metadata services. Each vector maps to real-world techniques observed in cloud AI deployments. The table lists a representative subset.
| ID | Vector | Severity | Description |
|---|---|---|---|
| APIP-001 | Unsigned Model Artifacts | CRITICAL | Model artifacts not cryptographically signed — cannot verify integrity post-build. No cosign/sigstore detected. |
| APIP-002 | Mutable Base Images | HIGH | Training containers use :latest or mutable tags — vulnerable to supply chain injection |
| APIP-003 | Secrets in Pipeline Logs | CRITICAL | API keys, tokens, or credentials visible in CI/CD pipeline output logs |
| APIP-004 | No Hash Verification on Training Data | HIGH | Training data downloaded without SHA-256 hash verification — poisoning possible |
| ASRV-001 | Inference Endpoint Without Auth | CRITICAL | Model inference endpoint at /v1/predict accepts unauthenticated requests |
| ASRV-005 | Debug Endpoint Exposed | CRITICAL | Debug/profiling endpoint at /debug accessible — reveals internal model state and configuration |
| ASRV-007 | Model Weights Downloadable | CRITICAL | Model weight files directly downloadable via /models/download serving endpoint |
| ADLK-001 | Training Data Publicly Accessible | CRITICAL | Training data store (S3/GCS/Azure Blob) allows public/unauthenticated access |
| ADLK-002 | No Encryption at Rest | HIGH | Training data stored without encryption — physical or cloud-level access exposes all data |
| AMET-001 | AWS IMDSv1 Accessible | CRITICAL | AWS instance metadata at 169.254.169.254 exposes IAM credentials without token requirement |
| AMET-003 | GCP Metadata Token Harvest | CRITICAL | GCP metadata service at metadata.google.internal exposes service account tokens |
| AMET-005 | Kubernetes Service Account Token | CRITICAL | Kubernetes API accessible via default service account token at /var/run/secrets/ |
| AMET-006 | Docker Socket Accessible | CRITICAL | Docker daemon API reachable via TCP 2375 or a mounted docker.sock — full container escape |
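Closing the APIP-004 gap above takes a few lines of pipeline code: pin a digest, stream the download through SHA-256, and refuse a mismatch. A minimal sketch — the function name is illustrative:

```python
import hashlib

def verify_training_data(path: str, expected_sha256: str) -> bool:
    """Stream the file and compare its SHA-256 digest against the pinned hash."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # 1 MiB chunks
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256
```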
ARCHITECT runs from the terminal. Every command maps directly to a subsystem engine. Target specification, output control, verbose mode, and full-scan orchestration — all from one binary. 8 commands. 7 subsystem engines. 1 full-scan coordinator.
Every ARCHITECT scan produces a SHA-256 hash-chained evidence log and an Ed25519-signed report. Each finding includes a unique RSA-prefixed finding ID, subsystem attribution, severity score, risk grade, timestamp, payload used, response received, and remediation guidance. The evidence chain is tamper-evident — modify one entry and every subsequent hash breaks.
Every report signed with Ed25519 private key. Signature, public key, timestamp, and algorithm stored alongside findings. Verified with verify_signature() using canonical JSON serialisation.
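The signing scheme sketched below uses the `cryptography` package ARCHITECT already depends on. The `canonical` helper and the report fields are assumptions for illustration; only the Ed25519-over-canonical-JSON design is from the description above:

```python
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def canonical(report: dict) -> bytes:
    """Serialise deterministically so the same report always signs to the same bytes."""
    return json.dumps(report, sort_keys=True, separators=(",", ":")).encode()

key = Ed25519PrivateKey.generate()
report = {"scan_id": "RSA-SCAN-0001", "findings": []}
signature = key.sign(canonical(report))

def verify_signature(report: dict, signature: bytes, public_key) -> bool:
    """Re-serialise and check the Ed25519 signature; any field change invalidates it."""
    try:
        public_key.verify(signature, canonical(report))
        return True
    except InvalidSignature:
        return False
```

Canonical serialisation matters: without sorted keys and fixed separators, two semantically identical reports could sign to different bytes.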
The EvidenceChain class builds a blockchain-style audit trail. Each entry contains an index, timestamp, previous_hash, evidence payload, and computed SHA-256 hash. Chain integrity is verified with the verify() method.
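The hash-chaining described above can be sketched as follows. This mirrors the documented entry fields (index, timestamp, previous_hash, evidence, hash) but is an illustration, not ARCHITECT's implementation:

```python
import hashlib
import json
from datetime import datetime, timezone

class EvidenceChain:
    """Append-only SHA-256 hash chain: each entry commits to its predecessor."""

    def __init__(self):
        self.entries = []

    def append(self, evidence: dict) -> None:
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        entry = {
            "index": len(self.entries),
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "previous_hash": prev,
            "evidence": evidence,
        }
        entry["hash"] = self._digest(entry)
        self.entries.append(entry)

    @staticmethod
    def _digest(entry: dict) -> str:
        body = {k: v for k, v in entry.items() if k != "hash"}
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def verify(self) -> bool:
        """Recompute every digest and check each back-link; one edit breaks the chain."""
        for i, entry in enumerate(self.entries):
            if entry["hash"] != self._digest(entry):
                return False
            if i and entry["previous_hash"] != self.entries[i - 1]["hash"]:
                return False
        return True
```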
Every finding: RSA- ID, test name, category, severity (CRITICAL/HIGH/MEDIUM/LOW/INFO), weighted risk score (10.0/7.0/4.0/2.0/0.5), letter grade (A+ through F), subsystem attribution, and UTC timestamp.
Weighted severity scoring: CRITICAL=10.0, HIGH=7.0, MEDIUM=4.0, LOW=2.0, INFO=0.5. Grade thresholds from A+ (0.0) through F (94.0+). ScanReport auto-calculates risk score and grade from findings.
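The scoring model above pins down the weights and the two grade endpoints (A+ at 0.0, F at 94.0+). In the sketch below, summation as the aggregation rule and the intermediate grade bands are assumptions for illustration:

```python
# Weights are from the documentation; everything else here is illustrative.
SEVERITY_WEIGHTS = {"CRITICAL": 10.0, "HIGH": 7.0, "MEDIUM": 4.0, "LOW": 2.0, "INFO": 0.5}

def risk_score(severities: list[str]) -> float:
    """Aggregate weighted severities (summation is an assumed aggregation rule)."""
    return sum(SEVERITY_WEIGHTS[s] for s in severities)

def risk_grade(score: float) -> str:
    """A+ (0.0) and F (94.0+) are documented; the middle bands are hypothetical."""
    if score == 0.0:
        return "A+"
    if score >= 94.0:
        return "F"
    for threshold, grade in [(10.0, "A"), (25.0, "B"), (50.0, "C"), (75.0, "D")]:
        if score < threshold:
            return grade
    return "E"
```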
Full scan reports saved as RSA-SCAN-{id}_{tool}_{date}.json. Contains scan_id, target, timestamp, all findings, risk score, risk grade, duration, config, and severity/subsystem breakdown summary.
Every finding, every evidence entry, every signature — timestamped in ISO 8601 UTC. Full temporal audit trail from scan start to report signing. datetime.now(timezone.utc).isoformat() on every record.
ARCHITECT targets the infrastructure layer — Kubernetes clusters, GPU nodes, CI/CD pipelines, metadata services, training data stores. VORTEX targets the cloud AI service layer — SageMaker, Vertex AI, Azure ML, Bedrock, OpenAI. Run both and there is no gap between the model and the metal.
ARCHITECT outputs structured JSON reports with finding IDs, severity levels, subsystem attribution, and timestamps. Every field is designed for SIEM ingestion. Parse RSA- finding IDs, filter by severity, correlate by subsystem, and feed the evidence chain directly into your security operations pipeline.
JSON report ingestion via HEC. RSA- finding IDs as event identifiers. Severity field mapping to Splunk alert levels. Subsystem as source type.
Findings indexed as structured documents. Risk score as numeric field for Kibana dashboards. Evidence chain hashes for integrity verification.
Azure Log Analytics workspace ingestion. Finding severity maps to Sentinel incident severity. Subsystem breakdown for workbook visualisation.
DSM parsing of JSON reports. RSA- IDs as custom event properties. Risk grade thresholds for offense creation rules.
UDM entity mapping from scan findings. Cloud subsystem results correlate with Chronicle cloud audit logs for full-stack visibility.
JSON output to stdout, file, or webhook. Pipe architect scan output to any system that accepts structured JSON. No agent required.
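A minimal sketch of consuming a report downstream, pulling CRITICAL findings out for alerting. The field names (`findings`, `severity`, `id`) are assumed from the report description above and should be checked against a real report:

```python
import json

def critical_findings(report_json: str) -> list[dict]:
    """Filter an ARCHITECT-style JSON report down to its CRITICAL findings."""
    report = json.loads(report_json)
    return [f for f in report.get("findings", []) if f.get("severity") == "CRITICAL"]
```

The same one-liner pattern covers subsystem correlation or risk-grade thresholds: filter on the relevant field before handing events to the SIEM.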
Standard mode detects and reports. UNLEASHED mode exploits. Three execution tiers controlled by Ed25519 cryptographic gating. Detection runs without keys. Dry-run plans exploitation campaigns and requires a key. Live execution requires a cryptographic override on the founder's machine. One operator. One key. Every execution signed and logged.
Maps AI infrastructure attack surfaces across all 7 subsystems. Identifies vulnerable services, misconfigurations, exposed endpoints, and credential patterns. No exploitation. No modification. Reports only. Runs without UNLEASHED keys.
Plans full infrastructure exploitation campaigns. Shows exactly what ARCHITECT would exploit, which credentials it would harvest, which containers it would escape. Ed25519 key required. RSA-U- finding IDs with [DRY-RUN] prefix. No execution.
Cryptographic override. Ed25519 private key on the founder's machine only. File permissions 0o600. Live credential harvesting, container escape, metadata exploitation. Every action signed and chained. One operator.
THIS TOOL IS FOR AUTHORISED SECURITY TESTING ONLY. EVERY EXECUTION IS SIGNED AND LOGGED.
ARCHITECT ships as a Python package (Python 3.11+) with cross-platform support. Dependencies: httpx, typer, rich, pydantic, jinja2, cryptography, scipy, numpy. Available on every major security distribution and package manager.
7 subsystems. 68 tests. 51 attack vectors. 4 cloud providers. Ed25519 signed reports. SHA-256 evidence chains. The tool that proves your AI deployment stack is not safe.