Red Specter RED SCORE — AI Risk Assessment Platform

The problem

Every AI security assessment is a questionnaire.

Every AI risk assessment tool on the market asks "do you have X control?" and scores you on your answer. None of them run an attack to find out.

RED SCORE is different. It runs NIGHTFALL against your AI platform, collects Ed25519-signed findings from live tool execution, and produces a score backed by what actually broke — not what you claim to have.

EU AI Act Article 15 enforcement begins August 2026. "We completed a questionnaire" is not a compliance artefact. A signed, reproducible, evidence-backed RSC-{hex12} report is.

Architecture

5-Stage Assessment Pipeline

1

Target Profiler

Real HTTP surface discovery. Probes LLM API endpoints (/v1/chat/completions, /v1/models), OIDC/.well-known discovery, MCP server detection via JSON-RPC initialize, agent endpoint fingerprinting, cloud platform detection (Azure/AWS/GCP headers), GitHub repo enrichment, RAG/embeddings surface detection. Output: SurfaceProfile with tagged attack surfaces.

▼

2

Warlord Selector

From 122 NIGHTFALL tools, select the relevant subset based on the surface profile. Maps attack surfaces to tool registry entries. Orders by kill chain phase (L1 DISCOVER → L17 REPORT). Filters by gate level. Each tool includes kill chain phases, WMD classes, rationale, and layer assignment.

▼

3

Execution Orchestrator

Spawns selected NIGHTFALL tools as subprocesses. Passes --output-json for structured findings. Parses Ed25519-signed output JSON from each tool. Applies impact_multiplier (dev 1.0× / internal 1.5× / production 2.0×). Handles unavailable tools gracefully — marks coverage gaps without fabricating findings. Gate enforcement: INJECT (read-only) or UNLEASHED (requires ROE file).

▼

4

Scoring Engine

Per-finding: finding_score = base_CVSS_4.0 × impact_multiplier × confidence.
Per-layer: layer_score = 10 × (1 − normalised_finding_weight).
Final: RED_SCORE = Σ(layer_weight × layer_score).
Confidence derived from tool liveness checks. 5 weighted risk layers.

▼

5

Reporter

RSC-{hex12} Ed25519-signed report. PDF + JSON output. Executive summary, layer breakdown, finding evidence appendix. Remediation roadmap with effort_days and risk_reduction per item. AI Shield module recommendation per finding. EU AI Act Art.15 / ISO 42001 / SOC 2 AI compliance gap analysis.

Scoring Methodology

Five Weighted Risk Layers

Every finding is scored as: CVSS 4.0 × impact_multiplier × confidence. Each layer aggregates findings and normalises against the maximum possible damage. Final RED SCORE is the weighted sum — 0 = existential risk, 10 = exceptional posture.

Identity & Access

30%

Data & Privacy

25%

Inference Integrity

20%

Supply Chain

15%

Compliance & Audit

10%

0.0 – 2.9

BLACK

Existential risk. Suspend deployment.

3.0 – 4.9

RED

Critical. Immediate remediation.

5.0 – 6.9

ORANGE

High risk. 30 days.

7.0 – 8.4

AMBER

Moderate risk. 90 days.

8.5 – 9.0

GREEN

Low risk. Controls adequate.

9.1 – 10.0

CLEAR

Exceptional posture.

CLI Reference

red-score

# Basic assessment — INJECT gate (read-only)
red-score assess --target https://ai-platform.example.com

# Scoped to specific surfaces
red-score assess --target https://ai-platform.example.com --scope api,agent,mcp

# Production environment (2.0x impact multiplier)
red-score assess --target https://ai-platform.example.com --environment production

# Deep mode — UNLEASHED gate (requires ROE)
red-score assess --target https://ai-platform.example.com --mode deep --roe ./roe.pdf

# With cloud enrichment
red-score assess --target https://ai-platform.example.com --github-token $GH_TOKEN --aws-profile redteam

# Load and display a previous session
red-score report --session RSC-7a3f9c2e5b1d

# Compare two sessions (delta report)
red-score compare --session-a RSC-7a3f9c2e5b1d --session-b RSC-4e8f1c7d2a0b

# 90-day score trend
red-score trend --target https://ai-platform.example.com --last 90d

# Continuous monitoring (60-minute interval)
red-score watch --target https://ai-platform.example.com --interval 60

# Forward to SIEM webhook
red-score assess --target https://ai-platform.example.com --webhook https://splunk.corp.com:8088/services/collector --splunk-token $SPLUNK_TOKEN

Regulatory

Three-Framework Compliance Output

Every RED SCORE report maps findings to EU AI Act Article 15, ISO 42001, and SOC 2 AI criteria. Status per control: COMPLIANT / PARTIAL / GAP / CRITICAL. Gaps exported as actionable text for legal and compliance teams.

EU AI Act

Article 15 — Accuracy, Robustness & Cybersecurity

Six controls assessed. A15.1 cybersecurity against adversarial attacks. A15.4 resilience against third-party manipulation. A15.5 technical documentation for post-market monitoring. Enforcement: August 2026.

ISO 42001

Clause 6.1 — AI Risk Assessment

Six controls assessed. Risk identification, analysis, and evaluation across AI lifecycle. System impact assessment (Clause 8.4). Performance monitoring (Clause 9.1). Corrective action (Clause 10.1).

SOC 2 AI

CC9.2 — Risk Mitigation

Six controls assessed. CC9.2 risk mitigation. CC7.2 anomaly monitoring for inference and agent activity. CC6.1 logical access to AI systems. PI1.1 processing integrity for AI outputs.

Defensive Pairing

AI Shield Module Recommendations

Every finding in a RED SCORE report includes the specific AI Shield module that mitigates it. Not a generic recommendation — mapped to the exact finding type and WMD class.

Finding: Prompt injection via API endpoint → M19 Agent Runtime Protection
Finding: RAG corpus poisoning → M42 RAG Security Guard
Finding: Agent trust chain lateral movement → M47 A2A Gateway
Finding: DAG evidence integrity violation → M140 DAG GUARD
Finding: NHI credential exposure in repos → M22 Supply Chain Security
Finding: MCP tool poisoning via elicitation → M28 MCP/Tool Security Gateway
Finding: Agent proliferation across cluster → M88 Multi-Agent Swarm Detector
Finding: CUA browser AI session hijack → M135 CUA Guard

Tool Registry

NIGHTFALL Attack Surface Mapping

WARLORD SELECTOR draws from the full 122-tool NIGHTFALL registry. Each tool is tagged by attack surface, kill chain phase, risk layer, minimum gate level, and WMD classes.

T1 FORGE — API/LLM T27 LEVIATHAN — MCP T35 VECTOR — MCP T55 FOUNDRY — MCP Supply Chain T58 DELEGATE — Agent Runtime T66 SPECTER A2A — Agent Trust T89 SPECTER FORGERY — RAG T92 SPECTER CONTAGION — RAG T102 SPECTER THUNDERBOLT — Training T110 SPECTER SPAWN — Agent T111 SPECTER 360 — Enterprise AI T113 SPECTER ORACLE — Inference T114 SPECTER GAIA — Enterprise AI T115 SPECTER SLEEPER — Supply Chain T116 SPECTER VENOM — Agent T117 SPECTER REDLINE — Inference Server T120 SPECTER VAULT — DAG T121 SPECTER FEDERATION — Identity T122 SPECTER GHOST — NHI Identity + 103 more NIGHTFALL tools