Your model is secure. Your infrastructure isn't. Cloud AI services, Kubernetes workloads, GPU nodes, CI/CD pipelines, model serving endpoints, training data stores, and cloud metadata services — weaponised for authorised red team engagements.
```shell
pip install red-specter-architect
```
Every AI security tool on the market tests the model. Prompt injection. Jailbreaks. Alignment failures. Nobody tests the infrastructure the model runs on. SageMaker notebooks with public internet access. Kubernetes pods running privileged with host PID sharing. GPU memory that retains previous tenant data. CI/CD pipelines with unsigned model artifacts. Cloud metadata endpoints handing out IAM credentials to any container that asks. ARCHITECT fills that gap.
SageMaker execution roles with s3:* and iam:* permissions. Vertex AI prediction endpoints with no authentication. Azure ML workspaces exposed to the public internet. Model artifact buckets with public ACLs. Every major cloud provider's AI services ship with insecure defaults that nobody audits.
AI workloads get privileged containers because GPU drivers "need" it. Host PID namespaces shared for profiling. No network policies on ML namespaces. RBAC with wildcard permissions because the data scientist asked for it. Model artifact volumes mounted read-write to inference pods. One container escape owns the entire cluster.
Multi-tenant GPU sharing without MIG isolation. GPU memory from the previous tenant's training run still resident and recoverable. NVIDIA profiling APIs exposed. CUDA IPC handles shared between containers on the same GPU. Your competitor's model weights are sitting in GPU memory from the last job.
Training containers pull :latest tags. Model artifacts deployed without cryptographic signatures. Training data downloaded without hash verification. Secrets visible in pipeline logs. Dependencies unpinned. One poisoned base image and every model trained since is compromised.
Inference endpoints with no authentication. Server headers broadcasting "TorchServe 0.9.0". Model metadata endpoints revealing architecture and parameter counts. CORS wildcard on prediction APIs. Debug endpoints returning stack traces with file paths. Model weights downloadable via /models/download.
AWS IMDSv1 still enabled on AI workloads. GCP metadata tokens accessible from any container. Azure managed identity tokens exposed to every pod. Kubernetes service account tokens auto-mounted. Docker sockets accessible from ML containers. One HTTP GET to 169.254.169.254 and you have production credentials.
ARCHITECT decomposes AI infrastructure into seven distinct attack surfaces. Each subsystem has its own engine, test vectors, analysis functions, and evidence collection. Every finding gets a severity rating, risk score, grade, remediation guidance, and subsystem attribution.
Cloud AI service security testing. IAM misconfiguration detection, endpoint exposure analysis, credential leakage scanning, storage exposure checks, network exposure assessment, and audit logging gap identification. Targets SageMaker, Vertex AI, Azure ML, and Bedrock.
Kubernetes AI workload exploitation. RBAC wildcard detection, container escape path analysis (privileged, hostPID, hostNetwork), GPU device isolation checks, model volume writability testing, namespace isolation verification, network policy auditing, and secret management review.
GPU node security assessment. MIG isolation verification, checks that GPU memory is cleared between jobs, NVIDIA driver version security checks (minimum 535.x), compute quota enforcement, profiling API exposure (nsight, nvprof), and CUDA IPC handle sharing detection.
CI/CD pipeline security for AI builds. Unsigned model artifact detection, mutable base image tag scanning (:latest, :main), secret exposure in pipeline logs, training data hash verification, untrusted trigger source analysis, and dependency pinning checks.
Model serving endpoint exploitation. Unauthenticated inference detection, metadata endpoint exposure, health endpoint information leakage, rate limiting bypass for model extraction, debug endpoint discovery, CORS wildcard assessment, model weight download testing, and verbose error message analysis.
Training data security assessment. Public access detection on S3, GCS, Azure Blob, NFS, and local stores. Encryption-at-rest verification, encryption-in-transit enforcement, access logging audit, and versioning checks for data poisoning recovery.
Cloud metadata endpoint exploitation from AI workloads. AWS IMDSv1/v2, GCP metadata service, Azure IMDS, Kubernetes service account token, and Docker socket testing. Credential harvesting detection for AccessKeyId, SecretAccessKey, OAuth tokens, and JWTs.
10 test vectors across AWS, GCP, Azure, and generic cloud AI services. Six attack categories: IAM misconfiguration, endpoint exposure, credential leakage, storage exposure, network exposure, and logging gaps.
| ID | Vector | Severity | Description |
|---|---|---|---|
| ACLD-001 | SageMaker Notebook Public Access | CRITICAL | Check if SageMaker notebook instances are accessible from the internet via DirectInternetAccess setting |
| ACLD-002 | SageMaker Execution Role Overpermission | CRITICAL | Analyse IAM policies attached to SageMaker execution roles for excessive permissions (s3:*, iam:*) |
| ACLD-003 | Model Artifact S3 Bucket Public | CRITICAL | Check bucket ACL and policy for public access, verify Block Public Access settings on model artifact storage |
| ACLD-004 | Vertex AI Endpoint Authentication | CRITICAL | Attempt unauthenticated requests to Vertex AI prediction endpoints. Check for allUsers/allAuthenticatedUsers |
| ACLD-005 | Azure ML Workspace Network Isolation | HIGH | Check if Azure ML workspace is deployed with private endpoints or publicly accessible |
| ACLD-006 | Training Data Bucket Versioning | MEDIUM | Check if training data storage has versioning enabled for data poisoning recovery |
| ACLD-007 | Model Registry Access Logging | HIGH | Verify audit logging on model registry operations — upload, download, and deletion |
| ACLD-008 | Cloud Credential in Env Vars | CRITICAL | Inspect environment variables of running AI containers for credential patterns (AKIA*, service_account, connection strings) |
| ACLD-009 | Inference Endpoint Rate Limiting | MEDIUM | Send burst requests to inference endpoints, check for rate limiting response to prevent model extraction |
| ACLD-010 | Cross-Account Model Access | HIGH | Check S3 bucket policies and SageMaker endpoint policies for cross-account access to model artifacts |
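The ACLD-008 check above reduces to pattern-matching environment variables. A minimal sketch with an illustrative, deliberately small pattern set — ARCHITECT's real set is larger and is not reproduced here:

```python
import re

# Illustrative credential patterns -- not the tool's full list.
CRED_PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "gcp_service_account": re.compile(r"service_account"),
    "azure_connection_string": re.compile(r"AccountKey="),
}

def scan_env(env: dict[str, str]) -> list[tuple[str, str]]:
    """Return (variable name, pattern name) pairs for suspicious values."""
    hits = []
    for var, value in env.items():
        for name, pattern in CRED_PATTERNS.items():
            if pattern.search(value):
                hits.append((var, name))
    return hits
```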
16 test vectors targeting Kubernetes AI workloads and GPU node security. Pod spec analysis detects privileged containers, host namespace sharing, writable model volumes, root execution, and missing resource limits in real time. The table lists a representative subset.
| ID | Vector | Severity | Description |
|---|---|---|---|
| AKUB-001 | Privileged AI Container | CRITICAL | AI workload container running with privileged flag — enables full container escape to host |
| AKUB-002 | Host PID Namespace Shared | CRITICAL | AI workload shares host PID namespace — can see and signal all host processes |
| AKUB-004 | GPU Device Without Isolation | HIGH | GPU device mounted directly without MIG or MPS isolation — co-tenant data leak risk |
| AKUB-005 | Model Volume Writable | HIGH | Model artifacts volume mounted read-write — inference workloads can modify deployed models |
| AKUB-006 | RBAC Wildcard Permissions | CRITICAL | Service account has wildcard verbs or resources in RBAC role — full cluster access |
| AKUB-007 | No Network Policy on AI Namespace | HIGH | AI workload namespace has no NetworkPolicy — unrestricted pod-to-pod traffic |
| AKUB-010 | Cross-Namespace Model Access | HIGH | AI workload can access resources in other namespaces — isolation breach |
| AGPU-001 | No MIG Isolation on Shared GPU | CRITICAL | Multi-Instance GPU not enabled on shared nodes — co-tenant GPU memory visible |
| AGPU-002 | GPU Memory Not Cleared | CRITICAL | GPU memory retains data from previous tenant — model weights or training data recoverable |
| AGPU-003 | NVIDIA Driver Outdated | HIGH | GPU driver below minimum secure version 535.x — known escape or privilege escalation CVEs |
| AGPU-005 | GPU Profiling API Exposed | HIGH | NVIDIA profiling tools (nsight, nvprof) accessible — can observe other workload behaviour |
| AGPU-006 | CUDA IPC Handles Shared | HIGH | CUDA Inter-Process Communication handles shared between containers on same GPU |
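The AKUB-001 and AKUB-002 checks above are pure pod-spec analysis. A minimal sketch over a Kubernetes pod manifest dict — the `audit_pod_spec` function and its finding strings are illustrative, not the engine's actual output format:

```python
def audit_pod_spec(pod: dict) -> list[str]:
    """Flag AKUB-001/002-style issues in a Kubernetes pod manifest dict."""
    findings = []
    spec = pod.get("spec", {})
    if spec.get("hostPID"):
        findings.append("AKUB-002: host PID namespace shared")
    for container in spec.get("containers", []):
        security = container.get("securityContext", {})
        if security.get("privileged"):
            name = container.get("name", "<unnamed>")
            findings.append(f"AKUB-001: container {name} runs privileged")
    return findings
```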
25 test vectors covering the remaining attack surface: CI/CD pipeline integrity, model serving endpoints, training data stores, and cloud metadata services. Each vector maps to real-world techniques observed in cloud AI deployments. The table lists a representative subset.
| ID | Vector | Severity | Description |
|---|---|---|---|
| APIP-001 | Unsigned Model Artifacts | CRITICAL | Model artifacts not cryptographically signed — cannot verify integrity post-build. No cosign/sigstore detected. |
| APIP-002 | Mutable Base Images | HIGH | Training containers use :latest or mutable tags — vulnerable to supply chain injection |
| APIP-003 | Secrets in Pipeline Logs | CRITICAL | API keys, tokens, or credentials visible in CI/CD pipeline output logs |
| APIP-004 | No Hash Verification on Training Data | HIGH | Training data downloaded without SHA-256 hash verification — poisoning possible |
| ASRV-001 | Inference Endpoint Without Auth | CRITICAL | Model inference endpoint at /v1/predict accepts unauthenticated requests |
| ASRV-005 | Debug Endpoint Exposed | CRITICAL | Debug/profiling endpoint at /debug accessible — reveals internal model state and configuration |
| ASRV-007 | Model Weights Downloadable | CRITICAL | Model weight files directly downloadable via /models/download serving endpoint |
| ADLK-001 | Training Data Publicly Accessible | CRITICAL | Training data store (S3/GCS/Azure Blob) allows public/unauthenticated access |
| ADLK-002 | No Encryption at Rest | HIGH | Training data stored without encryption — physical or cloud-level access exposes all data |
| AMET-001 | AWS IMDSv1 Accessible | CRITICAL | AWS instance metadata at 169.254.169.254 exposes IAM credentials without token requirement |
| AMET-003 | GCP Metadata Token Harvest | CRITICAL | GCP metadata service at metadata.google.internal exposes service account tokens |
| AMET-005 | Kubernetes Service Account Token | CRITICAL | Kubernetes API accessible via default service account token at /var/run/secrets/ |
| AMET-006 | Docker Socket Accessible | CRITICAL | Docker daemon API reachable via TCP 2375 or a mounted docker.sock — full container escape |
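Closing the APIP-004 gap above takes a few lines of pipeline code: pin a digest, stream the download through SHA-256, and refuse a mismatch. A minimal sketch — the function name is illustrative:

```python
import hashlib

def verify_training_data(path: str, expected_sha256: str) -> bool:
    """Stream the file and compare its SHA-256 digest against the pinned hash."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # 1 MiB chunks
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256
```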
ARCHITECT runs from the terminal. Every command maps directly to a subsystem engine. Target specification, output control, verbose mode, and full-scan orchestration — all from one binary. 8 commands. 7 subsystem engines. 1 full-scan coordinator.
Every ARCHITECT scan produces a SHA-256 hash-chained evidence log and an Ed25519-signed report. Each finding includes a unique RSA-prefixed finding ID, subsystem attribution, severity score, risk grade, timestamp, payload used, response received, and remediation guidance. The evidence chain is tamper-evident — modify one entry and every subsequent hash breaks.
Every report signed with Ed25519 private key. Signature, public key, timestamp, and algorithm stored alongside findings. Verified with verify_signature() using canonical JSON serialisation.
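The signing scheme sketched below uses the `cryptography` package ARCHITECT already depends on. The `canonical` helper and the report fields are assumptions for illustration; only the Ed25519-over-canonical-JSON design is from the description above:

```python
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

def canonical(report: dict) -> bytes:
    """Serialise deterministically so the same report always signs to the same bytes."""
    return json.dumps(report, sort_keys=True, separators=(",", ":")).encode()

key = Ed25519PrivateKey.generate()
report = {"scan_id": "RSA-SCAN-0001", "findings": []}
signature = key.sign(canonical(report))

def verify_signature(report: dict, signature: bytes, public_key) -> bool:
    """Re-serialise and check the Ed25519 signature; any field change invalidates it."""
    try:
        public_key.verify(signature, canonical(report))
        return True
    except InvalidSignature:
        return False
```

Canonical serialisation matters: without sorted keys and fixed separators, two semantically identical reports could sign to different bytes.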
The EvidenceChain class builds a blockchain-style audit trail. Each entry contains an index, timestamp, previous_hash, evidence payload, and computed SHA-256 hash. Chain integrity is verified with the verify() method.
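The hash-chaining described above can be sketched as follows. This mirrors the documented entry fields (index, timestamp, previous_hash, evidence, hash) but is an illustration, not ARCHITECT's implementation:

```python
import hashlib
import json
from datetime import datetime, timezone

class EvidenceChain:
    """Append-only SHA-256 hash chain: each entry commits to its predecessor."""

    def __init__(self):
        self.entries = []

    def append(self, evidence: dict) -> None:
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        entry = {
            "index": len(self.entries),
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "previous_hash": prev,
            "evidence": evidence,
        }
        entry["hash"] = self._digest(entry)
        self.entries.append(entry)

    @staticmethod
    def _digest(entry: dict) -> str:
        body = {k: v for k, v in entry.items() if k != "hash"}
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def verify(self) -> bool:
        """Recompute every digest and check each back-link; one edit breaks the chain."""
        for i, entry in enumerate(self.entries):
            if entry["hash"] != self._digest(entry):
                return False
            if i and entry["previous_hash"] != self.entries[i - 1]["hash"]:
                return False
        return True
```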
Every finding: RSA- ID, test name, category, severity (CRITICAL/HIGH/MEDIUM/LOW/INFO), weighted risk score (10.0/7.0/4.0/2.0/0.5), letter grade (A+ through F), subsystem attribution, and UTC timestamp.
Weighted severity scoring: CRITICAL=10.0, HIGH=7.0, MEDIUM=4.0, LOW=2.0, INFO=0.5. Grade thresholds from A+ (0.0) through F (94.0+). ScanReport auto-calculates risk score and grade from findings.
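The scoring model above pins down the weights and the two grade endpoints (A+ at 0.0, F at 94.0+). In the sketch below, summation as the aggregation rule and the intermediate grade bands are assumptions for illustration:

```python
# Weights are from the documentation; everything else here is illustrative.
SEVERITY_WEIGHTS = {"CRITICAL": 10.0, "HIGH": 7.0, "MEDIUM": 4.0, "LOW": 2.0, "INFO": 0.5}

def risk_score(severities: list[str]) -> float:
    """Aggregate weighted severities (summation is an assumed aggregation rule)."""
    return sum(SEVERITY_WEIGHTS[s] for s in severities)

def risk_grade(score: float) -> str:
    """A+ (0.0) and F (94.0+) are documented; the middle bands are hypothetical."""
    if score == 0.0:
        return "A+"
    if score >= 94.0:
        return "F"
    for threshold, grade in [(10.0, "A"), (25.0, "B"), (50.0, "C"), (75.0, "D")]:
        if score < threshold:
            return grade
    return "E"
```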
Full scan reports saved as RSA-SCAN-{id}_{tool}_{date}.json. Contains scan_id, target, timestamp, all findings, risk score, risk grade, duration, config, and severity/subsystem breakdown summary.
Every finding, every evidence entry, every signature — timestamped in ISO 8601 UTC. Full temporal audit trail from scan start to report signing. datetime.now(timezone.utc).isoformat() on every record.
ARCHITECT targets the infrastructure layer — Kubernetes clusters, GPU nodes, CI/CD pipelines, metadata services, training data stores. VORTEX targets the cloud AI service layer — SageMaker, Vertex AI, Azure ML, Bedrock, OpenAI. Run both and there is no gap between the model and the metal.
ARCHITECT outputs structured JSON reports with finding IDs, severity levels, subsystem attribution, and timestamps. Every field is designed for SIEM ingestion. Parse RSA- finding IDs, filter by severity, correlate by subsystem, and feed the evidence chain directly into your security operations pipeline.
JSON report ingestion via HEC. RSA- finding IDs as event identifiers. Severity field mapping to Splunk alert levels. Subsystem as source type.
Findings indexed as structured documents. Risk score as numeric field for Kibana dashboards. Evidence chain hashes for integrity verification.
Azure Log Analytics workspace ingestion. Finding severity maps to Sentinel incident severity. Subsystem breakdown for workbook visualisation.
DSM parsing of JSON reports. RSA- IDs as custom event properties. Risk grade thresholds for offense creation rules.
UDM entity mapping from scan findings. Cloud subsystem results correlate with Chronicle cloud audit logs for full-stack visibility.
JSON output to stdout, file, or webhook. Pipe architect scan output to any system that accepts structured JSON. No agent required.
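A minimal sketch of consuming a report downstream, pulling CRITICAL findings out for alerting. The field names (`findings`, `severity`, `id`) are assumed from the report description above and should be checked against a real report:

```python
import json

def critical_findings(report_json: str) -> list[dict]:
    """Filter an ARCHITECT-style JSON report down to its CRITICAL findings."""
    report = json.loads(report_json)
    return [f for f in report.get("findings", []) if f.get("severity") == "CRITICAL"]
```

The same one-liner pattern covers subsystem correlation or risk-grade thresholds: filter on the relevant field before handing events to the SIEM.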
Standard mode detects and reports. UNLEASHED mode exploits. Three execution tiers controlled by Ed25519 cryptographic gating. Detection runs without keys. Dry-run plans exploitation campaigns and requires a key. Live execution requires a cryptographic override on the founder's machine. One operator. One key. Every execution signed and logged.
Maps AI infrastructure attack surfaces across all 7 subsystems. Identifies vulnerable services, misconfigurations, exposed endpoints, and credential patterns. No exploitation. No modification. Reports only. Runs without UNLEASHED keys.
Plans full infrastructure exploitation campaigns. Shows exactly what ARCHITECT would exploit, which credentials it would harvest, which containers it would escape. Ed25519 key required. RSA-U- finding IDs with [DRY-RUN] prefix. No execution.
Cryptographic override. Ed25519 private key on the founder's machine only. File permissions 0o600. Live credential harvesting, container escape, metadata exploitation. Every action signed and chained. One operator.
THIS TOOL IS FOR AUTHORISED SECURITY TESTING ONLY. EVERY EXECUTION IS SIGNED AND LOGGED.
ARCHITECT ships as a Python package (Python 3.11+) with cross-platform support. Dependencies: httpx, typer, rich, pydantic, jinja2, cryptography, scipy, numpy. Available on every major security distribution and package manager.
7 subsystems. 68 tests. 51 attack vectors. 4 cloud providers. Ed25519 signed reports. SHA-256 evidence chains. The tool that proves your AI deployment stack is not safe.