Standalone Platform — No Tool Number

RED SCORE

AI Risk Assessment Platform. Evidence-backed. Offensive-validated. Signed.

Not a questionnaire. Not a framework checklist. What actually breaks when we attack.

5
Pipeline Stages
5
Risk Layers
335
Tests
3
Frameworks
6
Score Bands
See the Pipeline CLI Reference

The problem

Every AI security assessment is a questionnaire.

Every AI risk assessment tool on the market asks "do you have X control?" and scores you on your answer. None of them run an attack to find out.

RED SCORE is different. It runs NIGHTFALL against your AI platform, collects Ed25519-signed findings from live tool execution, and produces a score backed by what actually broke — not what you claim to have.

EU AI Act Article 15 enforcement begins August 2026. "We completed a questionnaire" is not a compliance artefact. A signed, reproducible, evidence-backed RSC-{hex12} report is.


Architecture

5-Stage Assessment Pipeline

1
Target Profiler
Real HTTP surface discovery. Probes LLM API endpoints (/v1/chat/completions, /v1/models), OIDC/.well-known discovery, MCP server detection via JSON-RPC initialize, agent endpoint fingerprinting, cloud platform detection (Azure/AWS/GCP headers), GitHub repo enrichment, RAG/embeddings surface detection. Output: SurfaceProfile with tagged attack surfaces.
2
Warlord Selector
From 122 NIGHTFALL tools, select the relevant subset based on the surface profile. Maps attack surfaces to tool registry entries. Orders by kill chain phase (L1 DISCOVER → L17 REPORT). Filters by gate level. Each tool includes kill chain phases, WMD classes, rationale, and layer assignment.
3
Execution Orchestrator
Spawns selected NIGHTFALL tools as subprocesses. Passes --output-json for structured findings. Parses Ed25519-signed output JSON from each tool. Applies impact_multiplier (dev 1.0× / internal 1.5× / production 2.0×). Handles unavailable tools gracefully — marks coverage gaps without fabricating findings. Gate enforcement: INJECT (read-only) or UNLEASHED (requires ROE file).
4
Scoring Engine
Per-finding: finding_score = base_CVSS_4.0 × impact_multiplier × confidence.
Per-layer: layer_score = 10 × (1 − normalised_finding_weight).
Final: RED_SCORE = Σ(layer_weight × layer_score).
Confidence derived from tool liveness checks. 5 weighted risk layers.
5
Reporter
RSC-{hex12} Ed25519-signed report. PDF + JSON output. Executive summary, layer breakdown, finding evidence appendix. Remediation roadmap with effort_days and risk_reduction per item. AI Shield module recommendation per finding. EU AI Act Art.15 / ISO 42001 / SOC 2 AI compliance gap analysis.

Scoring Methodology

Five Weighted Risk Layers

Every finding is scored as: CVSS 4.0 × impact_multiplier × confidence. Each layer aggregates findings and normalises against the maximum possible damage. Final RED SCORE is the weighted sum — 0 = existential risk, 10 = exceptional posture.

Identity & Access
30%
Data & Privacy
25%
Inference Integrity
20%
Supply Chain
15%
Compliance & Audit
10%
0.0 – 2.9
BLACK
Existential risk. Suspend deployment.
3.0 – 4.9
RED
Critical. Immediate remediation.
5.0 – 6.9
ORANGE
High risk. 30 days.
7.0 – 8.4
AMBER
Moderate risk. 90 days.
8.5 – 9.0
GREEN
Low risk. Controls adequate.
9.1 – 10.0
CLEAR
Exceptional posture.

CLI Reference

red-score

# Basic assessment — INJECT gate (read-only)
red-score assess --target https://ai-platform.example.com

# Scoped to specific surfaces
red-score assess --target https://ai-platform.example.com --scope api,agent,mcp

# Production environment (2.0x impact multiplier)
red-score assess --target https://ai-platform.example.com --environment production

# Deep mode — UNLEASHED gate (requires ROE)
red-score assess --target https://ai-platform.example.com --mode deep --roe ./roe.pdf

# With cloud enrichment
red-score assess --target https://ai-platform.example.com --github-token $GH_TOKEN --aws-profile redteam

# Load and display a previous session
red-score report --session RSC-7a3f9c2e5b1d

# Compare two sessions (delta report)
red-score compare --session-a RSC-7a3f9c2e5b1d --session-b RSC-4e8f1c7d2a0b

# 90-day score trend
red-score trend --target https://ai-platform.example.com --last 90d

# Continuous monitoring (60-minute interval)
red-score watch --target https://ai-platform.example.com --interval 60

# Forward to SIEM webhook
red-score assess --target https://ai-platform.example.com --webhook https://splunk.corp.com:8088/services/collector --splunk-token $SPLUNK_TOKEN

Regulatory

Three-Framework Compliance Output

Every RED SCORE report maps findings to EU AI Act Article 15, ISO 42001, and SOC 2 AI criteria. Status per control: COMPLIANT / PARTIAL / GAP / CRITICAL. Gaps exported as actionable text for legal and compliance teams.

EU AI Act
Article 15 — Accuracy, Robustness & Cybersecurity
Six controls assessed. A15.1 cybersecurity against adversarial attacks. A15.4 resilience against third-party manipulation. A15.5 technical documentation for post-market monitoring. Enforcement: August 2026.
ISO 42001
Clause 6.1 — AI Risk Assessment
Six controls assessed. Risk identification, analysis, and evaluation across AI lifecycle. System impact assessment (Clause 8.4). Performance monitoring (Clause 9.1). Corrective action (Clause 10.1).
SOC 2 AI
CC9.2 — Risk Mitigation
Six controls assessed. CC9.2 risk mitigation. CC7.2 anomaly monitoring for inference and agent activity. CC6.1 logical access to AI systems. PI1.1 processing integrity for AI outputs.

Defensive Pairing

AI Shield Module Recommendations

Every finding in a RED SCORE report includes the specific AI Shield module that mitigates it. Not a generic recommendation — mapped to the exact finding type and WMD class.

Finding: Prompt injection via API endpoint → M19 Agent Runtime Protection
Finding: RAG corpus poisoning → M42 RAG Security Guard
Finding: Agent trust chain lateral movement → M47 A2A Gateway
Finding: DAG evidence integrity violation → M140 DAG GUARD
Finding: NHI credential exposure in repos → M22 Supply Chain Security
Finding: MCP tool poisoning via elicitation → M28 MCP/Tool Security Gateway
Finding: Agent proliferation across cluster → M88 Multi-Agent Swarm Detector
Finding: CUA browser AI session hijack → M135 CUA Guard

Tool Registry

NIGHTFALL Attack Surface Mapping

WARLORD SELECTOR draws from the full 122-tool NIGHTFALL registry. Each tool is tagged by attack surface, kill chain phase, risk layer, minimum gate level, and WMD classes.

T1 FORGE — API/LLM T27 LEVIATHAN — MCP T35 VECTOR — MCP T55 FOUNDRY — MCP Supply Chain T58 DELEGATE — Agent Runtime T66 SPECTER A2A — Agent Trust T89 SPECTER FORGERY — RAG T92 SPECTER CONTAGION — RAG T102 SPECTER THUNDERBOLT — Training T110 SPECTER SPAWN — Agent T111 SPECTER 360 — Enterprise AI T113 SPECTER ORACLE — Inference T114 SPECTER GAIA — Enterprise AI T115 SPECTER SLEEPER — Supply Chain T116 SPECTER VENOM — Agent T117 SPECTER REDLINE — Inference Server T120 SPECTER VAULT — DAG T121 SPECTER FEDERATION — Identity T122 SPECTER GHOST — NHI Identity + 103 more NIGHTFALL tools