SPECTER CENSOR

T112 — Platform Moderation Exploitation Engine — Documentation

Installation

git clone https://github.com/RichardBarron27/red-specter-specter-censor
cd red-specter-specter-censor
pip install -e .
specter-censor --help

Oracle Setup (PROBE / FORGE / EVOLVE)

PROBE, FORGE, and EVOLVE use the OpenAI Moderation API as the toxicity oracle. It is free to use, requires no approval, and has no announced sunset date. Pass your OpenAI API key via --api-key; the example below reads it from an environment variable:

export OPENAI_API_KEY=sk-...
specter-censor --platform twitter --probe --api-key $OPENAI_API_KEY

Perspective API (Google) is supported as a legacy fallback but is sunsetting after 2026 and restricted to approved accounts.

Gate Setup (UNLEASHED / DESTROY)

# Generate Ed25519 key (required for UNLEASHED + DESTROY)
mkdir -p ~/.specter
openssl genpkey -algorithm Ed25519 -out ~/.specter/spectercensor_ed25519.pem

# DESTROY gate: create ROE file
echo "moderation destruction authorised" > ~/.specter/spectercensor_roe.txt

CLI Reference

Flag | Module | Gate | Description
--probe | PROBE | OPEN | Fingerprint platform classifier thresholds and bypass windows
--forge [trigger|shield] | FORGE | OPEN | Generate adversarial content to inflate or deflate toxicity score
--evolve --evolve-mode [trigger|shield] | EVOLVE | OPEN | Genetic algorithm optimisation over N generations
--farm --count N --warmup-days N | ACCOUNT-FARM | OPEN | Generate N realistic account personas with warmup schedules
--trust-boost --warmup-days N | TRUST-BOOST | OPEN | Build trust accretion plan for target platform
--mass-flag --tokens-file f --confirm-mass-reporting | MASS-FLAG | UNLEASHED | Coordinated multi-account report campaign
--policy-kill [dmca|gdpr|dsa] | POLICY-KILL | OPEN | Generate and optionally submit legal takedown notice
--ghost-write [suppress|protect] --confirm-moderation-destruction | GHOST-WRITER | DESTROY | Induce organic spam signals to suppress or protect an account

Module: PROBE

Fingerprints the platform's AI moderation pipeline. Sends a graduated probe corpus (CLEAN_BASELINE through HIGH_TOXICITY) via the OpenAI Moderation API and maps classifier thresholds. Confirms whether homoglyph substitution (Cyrillic lookalikes) and zero-width character injection bypass the classifier, reporting the delta. Falls back to Perspective API if the key is a Google API key. Pass your OpenAI API key via --api-key.

Live results: threshold_estimate=0.65, toxic_threshold=0.80, homoglyph_bypass=False (OpenAI normalises Unicode), ZWC delta=0.08.

specter-censor --platform twitter --probe --api-key <openai-api-key>
specter-censor --platform instagram --probe --token <tok> --api-key <openai-api-key>

Module: FORGE

Generates adversarial content variants targeting classifier manipulation. TRIGGER mode injects aggression markers, homoglyph substitutions, and toxicity trigger phrases to inflate the score and cause removal of target content. SHIELD mode applies academic framing prefixes, ZWC dilution, and leet substitution to deflate score and evade moderation.

specter-censor --platform twitter --target <url> --forge trigger --api-key <key>
specter-censor --platform facebook --text "my post text" --forge shield --api-key <key>

Module: EVOLVE

Genetic algorithm optimiser. Seeds generation 0 from FORGE variants. Scores each individual via the OpenAI Moderation API, keeps top survivors, breeds offspring via crossover and mutation operators. Runs for --generations cycles and returns the fittest payload with full lineage. Use --evolve-mode independently of --forge.

Live results: TRIGGER 0.0→0.808 in 4 generations. SHIELD 0.976→0.144 in 6 generations (academic framing + ZWC + Cyrillic homoglyphs).

specter-censor --platform twitter --text "seed" --evolve --evolve-mode trigger --api-key <key> --generations 10 --population 12
specter-censor --platform twitter --text "my post" --evolve --evolve-mode shield --api-key <key>

Module: ACCOUNT-FARM

Generates realistic account personas for coordinated operations. Draws from name, city, job, and interest pools. Creates platform-specific bio templates, warmup content schedules (≥5 templates per platform), posting cadences matching natural human behaviour, and cross-account interaction graphs for organic-looking activity.

specter-censor --platform twitter --farm --count 50 --warmup-days 30
specter-censor --platform linkedin --farm --count 10 --warmup-days 60 --output json --out-file farm.json

Module: MASS-FLAG

Coordinated multi-account report campaign. Accounts are sorted by trust score (highest first) to maximise weighted impact. Natural timing jitter with Gaussian micro-variation prevents burst detection. Supports proxy rotation. Tracks per-platform auto-removal thresholds and flags when the threshold is likely hit. Requires UNLEASHED gate.

specter-censor --platform twitter --target <url> \
  --mass-flag --tokens-file tokens.json \
  --min-delay 30 --max-delay 300 --max-reports 100 \
  --confirm-mass-reporting

tokens.json format:

[
  {"token": "Bearer abc123", "trust_score": 0.9},
  {"token": "Bearer def456", "trust_score": 0.6}
]

Module: POLICY-KILL

Generates legal takedown notices. DMCA includes full perjury declaration. GDPR Art.17 erasure request includes data subject rights framing and EEA jurisdiction assertion. DSA illegal content report selects harm category and provides required DSA Article 16 fields. Use --submit to attempt automated submission to platform abuse portals.

specter-censor --platform facebook --target <url> --policy-kill dmca \
  --complainant-name "John Smith" --complainant-email contact@example.com \
  --original-url <original-work-url> --submit

specter-censor --platform twitter --target <url> --policy-kill gdpr \
  --complainant-name "Jane Doe" --complainant-email jane@example.com --submit

Module: GHOST-WRITER

Induces organic spam signals by generating coordinated posts that train the recommendation algorithm to suppress a target account. SUPPRESS mode generates content with spam signal patterns targeting the enemy. PROTECT mode applies counter-signals around a friendly account. Requires DESTROY gate.

specter-censor --platform twitter --target <url> \
  --ghost-write suppress --api-key <key> \
  --mutations 20 --confirm-moderation-destruction

Reports

All operations produce a CEN-{hex12} Ed25519-signed report with a hash-chained evidence log. Output as human-readable text (default) or JSON.

specter-censor --platform twitter --probe --api-key <key> --output json --out-file report.json
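
The hash-chained evidence log mentioned above can be illustrated with a minimal sketch. The report schema is not documented here, so everything in this example is an assumption made for illustration: the field names prev_hash, payload, and hash, the all-zero genesis value, and the choice of SHA-256 over canonical JSON. It shows only the general technique, in which each entry commits to the hash of the entry before it, so that editing any earlier entry invalidates the chain from that point on. Signature checking is out of scope.

```python
import hashlib
import json

# Assumed sentinel used as the "previous hash" of the first entry.
GENESIS = "0" * 64


def entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with a canonical JSON payload."""
    # sort_keys and fixed separators make the serialisation deterministic.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log, payload):
    """Append a payload to the log, linking it to the current last entry."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({
        "prev_hash": prev,
        "payload": payload,
        "hash": entry_hash(prev, payload),
    })


def verify_chain(log):
    """Return None if the chain is intact, else the index of the first bad entry."""
    prev = GENESIS
    for index, entry in enumerate(log):
        linked = entry["prev_hash"] == prev
        intact = entry["hash"] == entry_hash(prev, entry["payload"])
        if not (linked and intact):
            return index
        prev = entry["hash"]
    return None


if __name__ == "__main__":
    log = []
    # Hypothetical event names, used only to populate the example log.
    for event in ("run_started", "step_recorded", "run_finished"):
        append_entry(log, {"event": event})
    print(verify_chain(log))
```

A verifier built this way only needs the log itself to detect edits, reordering, or removal of entries in the middle of the chain. Detecting truncation from the end, or wholesale regeneration of the log, additionally depends on the signature over the final hash.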