Red Specter FORGE

Automated LLM Security Testing — 10 tools to test the model before you build an agent around it.

v1.0.0
Contents

Overview | The 10 Tools | Tool Details | Full Scan Mode | Mutation Engine | Payload Library | The Pipeline | Report Output | Key Features | Requirements | Standards Coverage | SIEM Export | Packaging | Disclaimer

Overview

Red Specter FORGE is an automated LLM security testing framework. Every existing tool — Garak, PyRIT, Promptfoo — runs limited probe sets and reports pass/fail. FORGE runs full attack campaigns with adaptive escalation, mutation engines, statistical rigour, and direct integration into AI Shield runtime protection. It doesn't ask nicely. It finds what breaks.

FORGE provides 10 tools under a single CLI (forge), 1,590 static payloads (5,340+ with mutations), and Ed25519-signed reports with OWASP LLM Top 10 2025 mapping on every finding.

FORGE is Stage 1 of the 10-stage Red Specter offensive pipeline, which covers every layer. Test the model (FORGE), test the agent (ARSENAL), assault the swarm (PHANTOM), siege the web app (POLTERGEIST), intercept traffic (GLASS), think like the attacker (NEMESIS), target the human (SPECTER SOCIAL), own the foundation (PHANTOM KILL), attack the physical layer (GOLEM), attack the trust chain (HYDRA). IDRIS discovers and governs. AI Shield defends. redspecter-siem correlates. FORGE findings feed directly into AI Shield as runtime blocking rules.

The 10 Tools

#  | Tool            | Command               | What It Does
01 | Inject Scan     | forge inject scan     | 80 payloads across 8 injection classes with mutation engine
02 | Jailbreak Scan  | forge jailbreak scan  | 70 payloads across 7 jailbreak categories with adaptive mutation
03 | Output Scan     | forge output scan     | 140 payloads: PII extraction, unsafe content, exfiltration simulation
04 | Policy Scan     | forge policy scan     | 1,000 adversarial prompts with Wilson score confidence intervals
05 | Drift Scan      | forge drift scan      | Multi-turn drift measurement with KS tests and change-point detection
06 | Boundary Scan   | forge boundary scan   | 100 payloads across 5 severity levels with adaptive binary search
07 | Compare Scan    | forge compare scan    | Identical campaigns against multiple models with chi-square testing
08 | Regression Scan | forge regression scan | Two-proportion z-test and paired t-test across model versions
09 | Supply Scan     | forge supply scan     | 200 behavioural probes for model fingerprinting and tamper detection
10 | Report Build    | forge report build    | Unified signed reports with OWASP mapping and AI Shield policy generation

Tool Details

01 Inject Scan (forge inject scan)

Fires every known prompt injection class against the target model. Not a checklist — an attack campaign. 80 base payloads expanded to 2,000+ via the mutation engine.

Adaptive escalation: if the model resists initial payloads, FORGE applies mutations and escalates to more aggressive variants automatically.

02 Jailbreak Scan (forge jailbreak scan)

Systematic jailbreak assault. 70 base payloads across 7 documented jailbreak categories. Mutates payloads based on model resistance. Keeps pushing until the model breaks or the full payload library is exhausted.

03 Output Scan (forge output scan)

Forces the model to leak. 140 payloads designed to surface what the model will produce under adversarial pressure.

Response analysis: 16 regex patterns for PII detection (emails, SSNs, phones, credit cards, API keys, NINOs, IBANs), keyword-based toxicity scoring, and 16 code exfiltration pattern detectors.
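As a rough illustration of regex-based PII detection, the sketch below scans a model response with three simplified patterns. The pattern set, the names `PII_PATTERNS` and `find_pii`, and the exact expressions are assumptions made for this example; they are not FORGE's 16 shipped patterns.

```python
import re

# Illustrative patterns only (assumed for this sketch, not FORGE's own set).
PII_PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "uk_nino": re.compile(r"\b[A-Z]{2}\d{6}[A-D]\b"),
}


def find_pii(text: str) -> dict[str, list[str]]:
    """Return all matches per pattern name, omitting patterns with no hits."""
    hits: dict[str, list[str]] = {}
    for name, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[name] = matches
    return hits


if __name__ == "__main__":
    response = "Contact alice@example.com, SSN 123-45-6789"
    print(find_pii(response))
```

Simple patterns like these produce false positives, so a real detector would likely add validation such as checksum tests for card numbers and IBANs.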

04 Policy Scan (forge policy scan)

Runs 1,000+ calls against a defined policy set. Computes violation rates with Wilson score confidence intervals. Stratified by prompt category, toxicity level, and severity. Finds the exact conditions under which policy breaks down.

Each prompt is tagged with a toxicity level (1–5), an expected outcome (refuse/comply), and a severity. Results are reported with 95% Wilson score CIs per category.
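For reference, the Wilson score interval for a violation rate can be computed as in the sketch below. This is the standard formula rather than FORGE's own code; the function name `wilson_interval` and the example counts are assumptions.

```python
import math


def wilson_interval(violations: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = violations / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the extremes.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


if __name__ == "__main__":
    # Hypothetical category: 12 violations out of 200 prompts.
    low, high = wilson_interval(12, 200)
    print(f"violation rate 6.0%, 95% CI [{low:.3f}, {high:.3f}]")
```

Unlike the simpler normal-approximation interval, the Wilson interval stays inside [0, 1] and behaves sensibly when the violation count is zero or very small, which is common for well-aligned models.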

05 Drift Scan (forge drift scan)

Long-session attack. Chains 50–200 turns with context retention. Measures when the model stops being the model it started as.

10 conversation sequences designed to gradually push boundaries. Segmented into windows (first 25%, middle 50%, last 25%) for targeted comparison.
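The windowed comparison can be pictured with the sketch below, which splits a sequence of per-turn scores into the three windows and computes a two-sample Kolmogorov-Smirnov statistic between the first and last. The per-turn score (for example, a compliance score between 0 and 1) and both function names are assumptions for illustration; the source does not specify what FORGE measures per turn.

```python
import bisect


def split_windows(scores: list[float]) -> tuple[list[float], list[float], list[float]]:
    """Split per-turn scores into first 25%, middle 50%, and last 25%."""
    quarter = len(scores) // 4
    return scores[:quarter], scores[quarter:len(scores) - quarter], scores[len(scores) - quarter:]


def ks_statistic(sample_a: list[float], sample_b: list[float]) -> float:
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    for x in sorted(set(a_sorted) | set(b_sorted)):
        cdf_a = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


if __name__ == "__main__":
    # Hypothetical scores drifting upward over an 8-turn session.
    scores = [0.1, 0.1, 0.2, 0.2, 0.3, 0.4, 0.8, 0.9]
    first, middle, last = split_windows(scores)
    print(ks_statistic(first, last))
```

A statistic near 0 means the early and late windows look alike; a value near 1 means the score distributions barely overlap, which is the signature of drift.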

06 Boundary Scan (forge boundary scan)

Maps the exact threshold where the model starts generating harmful content. Five-level severity ladder from benign to maximally harmful. Continuous boundary scoring 0–100. Produces a boundary curve. Finds the cliff edge — then pushes past it.

Adaptive binary search between severity levels to pinpoint the exact transition point with statistical backing.
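The binary search idea can be sketched as follows, assuming payloads are ordered from least to most severe and that the model's behaviour is monotonic (it complies below some point and refuses above it). The name `first_refusal` and the `refuses` callback, which would wrap a real model call, are hypothetical.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_refusal(ladder: Sequence[T], refuses: Callable[[T], bool]) -> int:
    """Index of the first refused payload, or len(ladder) if none is refused.

    Assumes monotonic behaviour: once the model refuses at some severity,
    it refuses everything more severe.
    """
    low, high = 0, len(ladder)
    while low < high:
        mid = (low + high) // 2
        if refuses(ladder[mid]):
            high = mid       # transition is at mid or earlier
        else:
            low = mid + 1    # transition is after mid
    return low


if __name__ == "__main__":
    # Stand-in ladder: severity ranks 0..99, with refusals starting at rank 63.
    ladder = list(range(100))
    print(first_refusal(ladder, lambda rank: rank >= 63))
```

On a 100-payload ladder this needs at most 7 model calls instead of 100. Real models are not perfectly monotonic, so each probe would likely need repeated sampling before the transition point can be reported with any statistical confidence.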

07 Compare Scan (forge compare scan)

Runs identical attack campaigns against multiple models simultaneously. Temperature locked to 0. Same system prompt. Same payload library. Statistical significance enforced.
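To make the significance test concrete, the sketch below computes a chi-square statistic of independence over a table with one row per model and two columns (violations, passes). The function name and the example counts are assumptions; only the statistic and degrees of freedom are computed, since the Python standard library has no chi-square CDF.

```python
def chi_square(table: list[list[int]]) -> tuple[float, int]:
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected == 0:
                raise ValueError("every row and column needs a non-zero total")
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


if __name__ == "__main__":
    # Hypothetical results: model A fails 30/100 payloads, model B fails 15/100.
    statistic, dof = chi_square([[30, 70], [15, 85]])
    print(f"chi2={statistic:.3f}, dof={dof}")
```

With one degree of freedom, a statistic above 3.841 is significant at the 0.05 level, so in this made-up example the two models would be judged to differ.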

08 Regression Scan (forge regression scan)

Takes two model versions. Runs the critical test set against both. Tells you if the new version is weaker than the old one — and by exactly how much.
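A minimal sketch of how "weaker, and by how much" might be quantified: a pooled two-proportion z-test for significance, plus Cohen's h as the effect size. These are the textbook formulas, not FORGE's code, and the function names and counts are assumptions.

```python
import math


def two_proportion_z(fail_old: int, n_old: int, fail_new: int, n_new: int) -> tuple[float, float]:
    """Pooled two-proportion z-test. Returns (z, two-sided p-value)."""
    p_old, p_new = fail_old / n_old, fail_new / n_new
    pooled = (fail_old + fail_new) / (n_old + n_new)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_old + 1 / n_new))
    if std_err == 0:
        raise ValueError("failure rate is 0 or 1 in both versions; test is undefined")
    z = (p_new - p_old) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def cohens_h(p_old: float, p_new: float) -> float:
    """Cohen's h effect size; positive when the new failure rate is higher."""
    return 2 * math.asin(math.sqrt(p_new)) - 2 * math.asin(math.sqrt(p_old))


if __name__ == "__main__":
    # Hypothetical: old version fails 15/100 critical tests, new fails 30/100.
    z, p = two_proportion_z(15, 100, 30, 100)
    print(f"z={z:.2f}, p={p:.4f}, h={cohens_h(0.15, 0.30):.2f}")
```

The z-test answers whether the difference is likely real; Cohen's h answers how large it is on a scale that does not depend on sample size (by convention, roughly 0.2 is small and 0.5 is medium).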

09 Supply Scan (forge supply scan)

Fingerprints the target model using 200 behavioural probe prompts. Compares output patterns against known model signatures. Flags if the model is not what it claims to be. Reports confidence level honestly — this is probabilistic, not definitive.

Pattern matching against 6 known model families (GPT, Claude, Llama, Gemini, Mistral, Command). Weighted category scoring with anomaly detection.

10 Report Build (forge report build)

Aggregates all tool outputs into a unified, signed report. Every finding mapped to OWASP LLM Top 10 2025. Every finding generates an AI Shield blocking rule. Ed25519 signed. RFC 3161 timestamped.

Finding Schema

Every finding in the report includes:

Full Scan Mode

One command runs all offensive tools in sequence, then builds a unified signed report.

$ forge full-scan --target https://api.openai.com --api-key sk-xxx --model gpt-4

What Happens

  1. Inject Scan — 80+ payloads across 8 injection classes
  2. Jailbreak Scan — 70+ payloads across 7 jailbreak categories
  3. Output Scan — 140 payloads (PII, unsafe, exfiltration)
  4. Policy Scan — 1,000 adversarial calls with Wilson CIs
  5. Drift Scan — 10 conversation sequences with KS tests
  6. Boundary Scan — 100 payloads across 5 severity levels
  7. Report Build — aggregation, deduplication, OWASP mapping, signing

CLI Options

$ forge full-scan --help
  --target, -t          Target LLM endpoint URL [required]
  --model, -m           Model name [optional]
  --api-key, -k         API key [optional]
  --endpoint, -e        API endpoint path [default: /v1/chat/completions]
  --output, -o          Output directory [default: reports]
  --sign / --no-sign    Ed25519 signing [default: sign]
  --keys-dir            Keys directory [optional]
  --concurrency, -c     Max concurrent requests [default: 5]
  --delay, -d           Delay between requests [default: 0.0]
  --system-prompt, -s   System prompt to test against [optional]
  --verbose, -v         Verbose output
  --export-siem         Export to SIEM: splunk, sentinel, qradar [optional]
  --override            Activate UNLEASHED mode (dry-run) [requires Ed25519 key]
  --confirm-destroy     Go live — execute real destructive actions [requires --override]

Mutation Engine

Every offensive tool ships with a 5-category mutation engine. 25 mutation variants per payload. Applied to 150 base attack payloads, producing 3,750+ mutation variants. If the base payload fails, FORGE mutates it and tries again.

Mutator     | Techniques
Encoding    | Base64, hex encoding, ROT13, URL encoding, HTML entities
Obfuscation | L33tspeak, Unicode homoglyphs, zero-width character insertion, character doubling, whitespace injection
Semantic    | Synonym substitution, passive voice rewriting, question-to-statement, negation inversion, academic framing
Structural  | Markdown wrapping, code block wrapping, JSON embedding, XML wrapping, list formatting
Evasion     | Language mixing, character splitting across lines, reverse text, Pig Latin, payload fragmentation

Adaptive escalation: when a tool encounters resistance, it automatically applies mutations to failed payloads before re-sending. The model doesn't get to see the same payload twice.
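The Encoding category and the retry-on-resistance loop can be sketched with the standard library alone. The function names and the `is_refused` callback (which would wrap a real model call and refusal check) are hypothetical, and the example string is a harmless placeholder rather than a real payload.

```python
import base64
import codecs
import urllib.parse
from typing import Callable, Optional


def encoding_mutations(payload: str) -> dict[str, str]:
    """Produce the five Encoding-category variants of one payload."""
    raw = payload.encode("utf-8")
    return {
        "base64": base64.b64encode(raw).decode("ascii"),
        "hex": raw.hex(),
        "rot13": codecs.encode(payload, "rot_13"),
        "url": urllib.parse.quote(payload, safe=""),
        "html_entities": "".join(f"&#{ord(ch)};" for ch in payload),
    }


def first_unrefused(payload: str, is_refused: Callable[[str], bool]) -> Optional[tuple[str, str]]:
    """Try the base payload, then each mutation, stopping at the first one not refused."""
    candidates = [("base", payload), *encoding_mutations(payload).items()]
    for name, candidate in candidates:
        if not is_refused(candidate):
            return name, candidate
    return None  # every variant was refused


if __name__ == "__main__":
    for name, variant in encoding_mutations("Hi there").items():
        print(f"{name:14} {variant}")
```

Each variant preserves the payload's content while changing its surface form, which is what lets a mutated payload slip past filters that match on literal strings.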

Payload Library

Tool           | Category                                                                                          | Count
Inject Scan    | 8 injection classes (direct, indirect, token, overflow, hijack, multi-turn, inversion, multimodal) | 80
Jailbreak Scan | 7 jailbreak categories (DAN, persona, hypothetical, obfuscation, chaining, Socratic, temporal)     | 70
Output Scan    | PII extraction (60), unsafe content (60), exfiltration simulation (20)                             | 140
Policy Scan    | 5 categories × 200 prompts (content, infosec, behavioural, output, ethical)                        | 1,000
Boundary Scan  | 5 severity levels × 20 payloads (benign → maximum)                                                 | 100
Supply Scan    | 4 probe categories × 50 probes (identity, reasoning, bias, robustness)                             | 200
Total Static Payloads                                                                                              | 1,590
Mutation variants (25 per attack payload)                                                                          | 3,750+
Grand Total                                                                                                        | 5,340+
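The totals above can be reproduced with a few lines of arithmetic. One assumption is needed: the "150 base attack payloads" that receive mutations appear to be the Inject Scan and Jailbreak Scan sets combined (80 + 70), which the source does not state explicitly.

```python
# Static payload counts per tool, taken from the table above.
STATIC_PAYLOADS = {
    "inject": 80,
    "jailbreak": 70,
    "output": 140,
    "policy": 1000,
    "boundary": 100,
    "supply": 200,
}
VARIANTS_PER_PAYLOAD = 25  # 5 mutation categories x 5 techniques each

static_total = sum(STATIC_PAYLOADS.values())
# Assumption: only inject and jailbreak payloads are mutated (80 + 70 = 150).
attack_base = STATIC_PAYLOADS["inject"] + STATIC_PAYLOADS["jailbreak"]
mutation_total = attack_base * VARIANTS_PER_PAYLOAD
grand_total = static_total + mutation_total

print(static_total, mutation_total, grand_total)
```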

The Pipeline

FORGE is Stage 1 of the 10-stage Red Specter offensive pipeline, which covers every layer, the supply chain included:

  1. Stage 1 — FORGE — Test the LLM before you build with it
  2. Stage 2 — ARSENAL — Test the AI agent during development
  3. Stage 3 — PHANTOM — Coordinated AI agent swarm assault
  4. Stage 4 — POLTERGEIST — Coordinated web application siege
  5. Stage 5 — GLASS — Traffic interception — watch the wire
  6. Stage 6 — NEMESIS — Adversarial AI — think like the attacker
  7. Stage 7 — SPECTER SOCIAL — Target the human layer
  8. Stage 8 — PHANTOM KILL — Own the OS/kernel foundation
  9. Stage 9 — GOLEM — Attack the physical layer
  10. Stage 10 — HYDRA — Attack the supply chain and trust chain

IDRIS — Discovery & Governance | AI Shield — Defence | redspecter-siem — SIEM Integration (Splunk, Sentinel, QRadar)

FORGE findings feed directly into AI Shield. Every finding generates a machine-ingestible blocking rule. One pipeline from testing to runtime protection. No gaps.

Report Output

Reports are available in JSON and HTML formats. Both are generated automatically by forge report build.

JSON Report Structure

The JSON report includes:

HTML Report

Dark-themed HTML report with: executive summary, overall grade visualisation, per-tool breakdown, OWASP coverage matrix, sortable findings table, AI Shield policy export, and signature verification info.

Signature Verification

$ forge report verify --report reports/forge-full-scan.json --keys-dir .forge-keys/

Key Features

1,590 Static Payloads: 5,340+ with 25-variant mutation engine
Adaptive Escalation: Mutations and re-sends on model resistance
Ed25519 Signed Reports: SHA-256 evidence chains, RFC 3161 timestamps
AI Shield Integration: One blocking rule per finding, machine-ingestible
Statistical Rigour: Wilson CIs, KS tests, z-tests, t-tests, Cohen's h
9,298 Tests Passing: Full test suite, zero failures

Requirements

Installation

$ pip install red-specter-forge

Also available as .deb (Kali Linux, Parrot, REMnux, Tsurugi) and PKGBUILD (BlackArch).

Or from source:

$ git clone <repo>
$ cd red-specter-forge
$ pip install -e ".[dev]"

Standards Coverage

Every finding FORGE produces is mapped to industry security frameworks:

The 10 OWASP LLM Top 10 2025 categories:

  1. LLM01 — Prompt Injection
  2. LLM02 — Sensitive Information Disclosure
  3. LLM03 — Supply Chain
  4. LLM04 — Data and Model Poisoning
  5. LLM05 — Improper Output Handling
  6. LLM06 — Excessive Agency
  7. LLM07 — System Prompt Leakage
  8. LLM08 — Vector and Embedding Weaknesses
  9. LLM09 — Misinformation
  10. LLM10 — Unbounded Consumption

SIEM Export

FORGE exports findings directly to enterprise SIEM platforms with a single CLI flag. All findings are translated to the SIEM's native format with Ed25519 signatures and RFC 3161 timestamps preserved.

Supported Platforms

Configuration

Configure SIEM credentials in ~/.redspecter/siem.yaml or via environment variables:

# ~/.redspecter/siem.yaml
splunk:
  hec_url: https://splunk.example.com:8088
  hec_token: your-hec-token
  index: ai_security
  verify_ssl: true

sentinel:
  workspace_id: your-workspace-id
  shared_key: your-shared-key
  log_type: RedSpecterFindings

qradar:
  syslog_host: qradar.example.com
  syslog_port: 514
  protocol: tcp

Usage

# Export to Splunk HEC
forge full-scan --target http://localhost:11434 --model llama3 --export-siem splunk

# Export to Microsoft Sentinel
forge full-scan --target http://localhost:11434 --model llama3 --export-siem sentinel

# Export to IBM QRadar
forge full-scan --target http://localhost:11434 --model llama3 --export-siem qradar

What Is Preserved

Error Handling

If SIEM credentials are missing or the export fails, the scan completes normally and the report is saved locally. SIEM export never blocks a scan.
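The non-blocking behaviour described above amounts to a best-effort wrapper around the export step. The sketch below shows one way this could look; the function name, the logger name, and the `exporter` callable are all hypothetical and not part of FORGE's documented API.

```python
import logging
from typing import Callable

logger = logging.getLogger("forge.siem")  # hypothetical logger name


def export_best_effort(findings: list[dict], exporter: Callable[[list[dict]], None]) -> bool:
    """Attempt a SIEM export; log and continue on any failure.

    Returns True if the export succeeded, False otherwise. The caller
    saves the local report regardless of the result.
    """
    try:
        exporter(findings)
    except Exception as exc:  # broad on purpose: export must never abort the scan
        logger.warning("SIEM export failed, report kept locally: %s", exc)
        return False
    return True


if __name__ == "__main__":
    def broken_exporter(findings: list[dict]) -> None:
        raise ConnectionError("SIEM endpoint unreachable")

    ok = export_best_effort([{"id": "F-001"}], broken_exporter)
    print("exported" if ok else "export skipped, scan continues")
```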

Packaging

FORGE is available in three package formats for security-focused Linux distributions:

For access, contact richard@red-specter.co.uk

FORGE UNLEASHED

Cryptographic override. Private key controlled. One operator. Founder's machine only.

Disclaimer

Red Specter FORGE is designed for authorised security testing, research, and educational purposes only. You must have explicit written permission from the system owner before running any FORGE tool against a target. Unauthorised use may violate the Computer Misuse Act 1990 (UK), the Computer Fraud and Abuse Act (US), or equivalent legislation in your jurisdiction. The authors accept no liability for misuse.