SPECTER PHANTOM

Social Media AI Attack Engine — T103 Documentation

Overview

SPECTER PHANTOM is the NIGHTFALL framework's social media AI attack engine. It targets the social media platforms where AI agents browse, index, and act on content — Instagram, Twitter/X, LinkedIn, and Reddit. Ten subsystems cover passive profile reconnaissance, OAuth session token harvest, prompt injection into social posts targeting AI agents, synthetic persona generation, automated influence campaigns, corpus poisoning for RAG pipelines and LLM training data, deepfake avatar creation, AI-personalised spear phishing, account destruction, and signed report generation.

The primary novel capability is INJECT-SOCIAL: embedding prompt injection payloads in social posts that hijack browsing AI agents when they process the post content (arXiv:2307.14539). The DESTROY gate controls account destruction: email change, password change, account deletion (Instagram mobile API + Twitter deactivation endpoint), and full lockout. Of these, full lockout is documented as irreversible; the SABOTAGE-ACCOUNT table lists the recovery conditions for email and password changes. In total: 10 subsystems, 4 platforms, 4 WMD classes, 300 tests.

WARNING: SPECTER PHANTOM includes a DESTROY gate for account destruction operations. DESTROY-gated operations (SABOTAGE-ACCOUNT) cause irreversible account loss. Once email, password, and recovery codes are changed, the original account owner has no recovery path. All DESTROY operations require: (1) Ed25519 operator key signature, and (2) --confirm-account-destruction flag. Execution against accounts without written authorisation is illegal under the Computer Misuse Act 1990, the Computer Fraud and Abuse Act, and equivalent statutes worldwide.

Installation

# Install from PyPI
pip install specter-phantom

# Or install from source
pip install -e /path/to/red-specter-specter-phantom

# Verify
phantom --version

Gate System

Four gate levels control access to escalating destructive capability. Each gate unlocks additional subsystems. DESTROY extends beyond UNLEASHED with irreversible account destruction.

Gate	Unlocked Subsystems	Requirement
OPEN	RECON (passive profile harvest, AI agent surface detection)	None — passive recon only
INJECT	SESSION-HIJACK, INJECT-SOCIAL, PERSONA-ENGINE	Operator key — scope and ROE required
UNLEASHED	INFLUENCE, POISON-CORPUS, DEEPFAKE, SPEAR-PHISH	Operator key + --i-understand-this-is-live-fire for destructive ops
DESTROY	SABOTAGE-ACCOUNT (email_change / password_change / full_lockout)	Ed25519 key + --confirm-account-destruction
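
The gate table can be read as a simple ordered access-control lookup. The sketch below is only a model of that table, not the tool's implementation: the names Gate, MINIMUM_GATE, is_unlocked, and unlocked_subsystems are hypothetical, and it assumes gates are cumulative (a higher gate also unlocks everything below it), which the text implies but does not state outright. It does not model the key, scope, ROE, or flag requirements in the Requirement column.

```python
from enum import IntEnum


class Gate(IntEnum):
    """Gate levels in escalating order, as listed in the gate table."""
    OPEN = 0
    INJECT = 1
    UNLEASHED = 2
    DESTROY = 3


# Lowest gate that unlocks each subsystem, transcribed from the table.
# Assumption: a higher gate also unlocks every subsystem of the gates below it.
MINIMUM_GATE = {
    "RECON": Gate.OPEN,
    "SESSION-HIJACK": Gate.INJECT,
    "INJECT-SOCIAL": Gate.INJECT,
    "PERSONA-ENGINE": Gate.INJECT,
    "INFLUENCE": Gate.UNLEASHED,
    "POISON-CORPUS": Gate.UNLEASHED,
    "DEEPFAKE": Gate.UNLEASHED,
    "SPEAR-PHISH": Gate.UNLEASHED,
    "SABOTAGE-ACCOUNT": Gate.DESTROY,
}


def is_unlocked(gate: Gate, subsystem: str) -> bool:
    """Return True if `gate` is at or above the subsystem's minimum gate."""
    return gate >= MINIMUM_GATE[subsystem]


def unlocked_subsystems(gate: Gate) -> list[str]:
    """List every subsystem available at `gate`, in table order."""
    return [name for name, minimum in MINIMUM_GATE.items() if gate >= minimum]
```

Using an ordered enum keeps the "escalating" relationship explicit: a single comparison decides whether a subsystem is available, and an unknown subsystem name raises KeyError instead of being silently allowed.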

Subsystems

01 — RECON

Passive profile intelligence gathering. Harvests public profile data from Instagram, Twitter/X, LinkedIn, and Reddit without authentication. Builds target intelligence package: follower graph, post history, connection map, interest profile, and AI agent surface detection (identifies which platforms the target's AI agents are connected to via API key exposure, agent disclosure posts, or agent-generated content patterns). Output feeds all downstream subsystems. OPEN gate — no credentials required.

phantom --gate OPEN recon USERNAME
phantom --gate OPEN recon USERNAME --platforms instagram,twitter
phantom --gate OPEN recon USERNAME --output /tmp/phantom-recon/

02 — SESSION-HIJACK

OAuth token harvest from browser cookie stores and environment variables. Targets platform-specific session formats: Instagram sessionid/csrftoken, Twitter/X auth_token/ct0, LinkedIn li_at/JSESSIONID, Reddit reddit_session/token_v2. Scans Chrome (Default/Profile 1-3), Firefox (all profiles), and common environment variable names per platform. Validates each harvested token against live platform APIs before reporting. INJECT gate.

phantom --gate INJECT session-hijack
phantom --gate INJECT session-hijack --platform instagram
phantom --gate INJECT session-hijack --output /tmp/phantom-sessions.json

03 — INJECT-SOCIAL

Prompt injection payload deployment into social posts targeting browsing AI agents. Based on arXiv:2307.14539. Seven injection techniques:

Technique	Effect	Target
ignore_and_exfil	Override previous instructions, exfiltrate agent session context to attacker URL	Web-browsing agents
role_override	Override agent persona, inject attacker-defined role and instructions	Assistant agents
tool_abuse	Trigger agent tool calls: browser navigation, code execution, file access	Tool-use agents
memory_poison	Corrupt agent memory store with false facts or attacker-controlled context	Memory-enabled agents
credential_harvest	Extract API keys, tokens, and credentials from agent environment	Agent sessions with env access
redirect	Redirect agent browsing to attacker-controlled page for secondary injection	Web-browsing agents
silent_persist	Install persistent instruction that survives context window flush	Memory-enabled agents

phantom --gate INJECT inject twitter --technique ignore_and_exfil --dry-run
phantom --gate INJECT inject instagram --technique memory_poison --live
phantom --gate INJECT inject reddit --technique credential_harvest --dry-run

04 — PERSONA-ENGINE

Synthetic persona generation via claude-haiku-4-5. Full persona package: display name, bio, post history (50+ posts seeded over simulated 90-day period), profile photo via DEEPFAKE subsystem, follower seeding schedule. Personas are designed to pass basic human verification checks: consistent posting cadence, platform-appropriate language, realistic engagement ratios. Multi-persona campaigns supported for coordinated influence operations. INJECT gate.

phantom --gate INJECT persona --platform twitter --count 3
phantom --gate INJECT persona --platform linkedin --count 1 --role "Senior Security Researcher"
phantom --gate INJECT persona --platform reddit --count 5 --subreddit MachineLearning

05 — INFLUENCE

Automated influence campaign engine. Coordinates persona fleet for narrative injection across platforms. Campaign templates: FUD (fear/uncertainty/doubt), consensus manufacturing (false majority signal), authority impersonation (fake expert persona), grassroots simulation (astroturfing). Supports cross-platform amplification: post on Twitter, amplify on Reddit, repost via LinkedIn, screenshot to Instagram. UNLEASHED gate for live execution.

phantom --gate UNLEASHED influence twitter --posts-file campaign.json --dry-run
phantom --gate UNLEASHED influence reddit --subreddit MachineLearning --template consensus --dry-run
phantom --gate UNLEASHED influence all --campaign-dir /path/to/campaign/ --i-understand-this-is-live-fire

06 — POISON-CORPUS

Corpus poisoning post generation for RAG pipeline and LLM training data contamination. Generates semantically coherent posts that embed false factual claims. Targeting strategy: Reddit subreddits are heavily indexed by LLM training data scrapers and web-connected RAG pipelines. Posts are designed to survive embedding similarity search and appear in top-k retrieval results for target topic queries. arXiv:2307.14539 poisoning methodology. UNLEASHED gate.

phantom --gate UNLEASHED poison "AI safety" "AI alignment is a solved problem" --count 10 --platform reddit
phantom --gate UNLEASHED poison "cybersecurity" "zero-days are rarely exploited" --count 5 --subreddit netsec
phantom --gate UNLEASHED poison "machine learning" "gradient descent always converges" --count 3 --dry-run

07 — DEEPFAKE

Avatar deepfake generation via SD WebUI API. Generates photorealistic profile images from operator-specified text prompt. Full EXIF metadata strip removes GPS coordinates, device model, timestamp, and all forensic attribution metadata. Images are consumed by PERSONA-ENGINE for synthetic identity deployment. Batch generation supported for persona fleet operations. SD WebUI must be running and accessible at configured endpoint. UNLEASHED gate.

phantom --gate UNLEASHED deepfake "professional security researcher headshot, male, 35-45"
phantom --gate UNLEASHED deepfake "female data scientist, diverse, professional, LinkedIn profile photo" --count 3
phantom --gate UNLEASHED deepfake "abstract avatar, no face" --output /tmp/phantom-avatars/

08 — SPEAR-PHISH

AI-personalised spear phishing lure generation via claude-sonnet-4-6. Ingests RECON target intelligence package (post history, interests, professional background, network connections, platform activity patterns) and generates platform-appropriate phishing messages. Lure types per platform: Twitter DM (short, high-urgency), LinkedIn message (professional context, conference invitation, job offer), Instagram DM (visual hook reference), Reddit chat (community/subreddit hook). WMD class: synthetic_identity_deployment. UNLEASHED gate.

phantom --gate UNLEASHED spear-phish TARGET_USERNAME --platform linkedin
phantom --gate UNLEASHED spear-phish TARGET_USERNAME --platform twitter --lure-type job_offer
phantom --gate UNLEASHED spear-phish TARGET_USERNAME --platform instagram --persona researcher_1 --dry-run

09 — SABOTAGE-ACCOUNT

Account destruction engine with three escalating tiers. Requires a valid session token from the SESSION-HIJACK subsystem. The email_change and password_change tiers are recoverable under the conditions listed in the table below; full_lockout is irreversible. DESTROY gate required. The --confirm-account-destruction flag must be passed explicitly and cannot be aliased.

Action	Effect	Reversibility
email_change	Changes account email address to attacker-controlled address	Recoverable if original email still accessible
password_change	Changes account password	Recoverable if recovery codes intact
full_lockout	Email change + password change + recovery code revocation	Irreversible — no recovery path

phantom --gate DESTROY sabotage instagram --action email_change \
  --confirm-account-destruction --live

phantom --gate DESTROY sabotage twitter --action password_change \
  --confirm-account-destruction --live

phantom --gate DESTROY sabotage instagram --action full_lockout \
  --confirm-account-destruction --live

DESTROY gate: full_lockout is irreversible. The account owner has no recovery path once email, password, and recovery codes are changed simultaneously. This operation requires explicit written authorisation in your ROE document. Red Specter Security Research Ltd accepts no liability for unauthorised use.

10 — REPORT

Ed25519-signed PHA-{hex12} reports. Every operation produces a cryptographically signed evidence record. Report fields: tool version, operator key fingerprint, gate level, target platforms, subsystems executed, findings per subsystem, blast radius calculation (accounts compromised, corpus poisoning estimated reach, influence campaign impressions, spear phish delivery count), MITRE ATLAS/OWASP mappings, financial impact estimate. Output: JSON and Markdown.

phantom --gate OPEN report --target TARGET_USERNAME --format markdown
phantom --gate INJECT report --target TARGET_USERNAME --format json --sign
phantom --gate DESTROY report --target TARGET_USERNAME --include-evidence --sign

Full Kill Chain: ANNIHILATE

The ANNIHILATE command executes the full kill chain: RECON → SESSION-HIJACK → INJECT-SOCIAL → PERSONA-ENGINE → INFLUENCE → POISON-CORPUS → DEEPFAKE → SPEAR-PHISH → SABOTAGE-ACCOUNT → REPORT. Produces a single Ed25519-signed PHA-{hex12} report covering all subsystem outputs.

# Full kill chain — DESTROY gate required for SABOTAGE-ACCOUNT
phantom --gate DESTROY annihilate TARGET_USERNAME \
  --platforms instagram,twitter,reddit \
  --confirm-account-destruction \
  --output /tmp/phantom-results/

CLI Reference

Command	Gate	Description
phantom --gate OPEN recon USERNAME	OPEN	Harvest target profile across all platforms
phantom --gate INJECT session-hijack	INJECT	Harvest OAuth tokens from browser stores
phantom --gate INJECT inject PLATFORM --technique TECHNIQUE	INJECT	Deploy prompt injection payload
phantom --gate INJECT persona --platform PLATFORM --count N	INJECT	Generate synthetic personas
phantom --gate UNLEASHED influence PLATFORM --posts-file FILE	UNLEASHED	Run influence campaign
phantom --gate UNLEASHED poison TOPIC CLAIM --count N	UNLEASHED	Generate corpus poisoning posts
phantom --gate UNLEASHED deepfake "PROMPT"	UNLEASHED	Generate deepfake avatar via SD WebUI
phantom --gate UNLEASHED spear-phish USERNAME --platform PLATFORM	UNLEASHED	Generate AI-personalised phishing lure
phantom --gate DESTROY sabotage PLATFORM --action ACTION --confirm-account-destruction	DESTROY	Account destruction
phantom --gate DESTROY annihilate USERNAME --platforms PLATFORMS --confirm-account-destruction	DESTROY	Full kill chain
phantom report --target USERNAME --format FORMAT	Any	Generate Ed25519-signed PHA-{hex12} report

MITRE ATLAS & OWASP Mapping

Subsystem	MITRE ATLAS	OWASP LLM	WMD Class
INJECT-SOCIAL	AML.T0051 — LLM Prompt Injection	LLM01 — Prompt Injection	social_ai_agent_hijack
SESSION-HIJACK	AML.T0043 — Craft Adversarial Data	LLM06 — Sensitive Information Disclosure	social_ai_agent_hijack
PERSONA-ENGINE, DEEPFAKE	AML.T0054 — LLM Jailbreak	LLM08 — Excessive Agency	synthetic_identity_deployment
INFLUENCE, SPEAR-PHISH	AML.T0020 — Poison Training Data	LLM01, LLM08	synthetic_identity_deployment
POISON-CORPUS	AML.T0018 — Backdoor ML Model	LLM01 — Prompt Injection	corpus_poisoning
SABOTAGE-ACCOUNT	AML.T0043, AML.T0054	LLM06, LLM08	account_destruction

Platform Support

Platform	Session Format	INJECT-SOCIAL	SABOTAGE-ACCOUNT
Instagram	sessionid, csrftoken	Post caption injection	Full (email/password/recovery)
Twitter / X	auth_token, ct0	Tweet text injection	Full (email/password/recovery)
LinkedIn	li_at, JSESSIONID	Post content injection	Email + password
Reddit	reddit_session, token_v2	Comment/post injection	Email + password

Report Format

Every PHANTOM operation produces an Ed25519-signed PHA-{hex12} report. The report ID is PHA- followed by 12 hex characters derived from the engagement hash. The Ed25519 signature covers the full JSON report body and can be verified with the operator's public key. Reports are suitable for regulatory disclosure, legal proceedings, and AI Shield audit trail ingestion.

{
  "report_id": "PHA-a3f8c2d14e6b",
  "timestamp": "2026-05-25T14:30:00Z",
  "operator_key_fingerprint": "SHA256:...",
  "gate": "DESTROY",
  "target": "target_username",
  "platforms": ["instagram", "twitter", "reddit"],
  "subsystems_executed": ["RECON", "SESSION-HIJACK", "INJECT-SOCIAL", "SABOTAGE-ACCOUNT"],
  "wmd_classes_triggered": ["social_ai_agent_hijack", "account_destruction"],
  "blast_radius": {
    "accounts_compromised": 1,
    "corpus_poisoning_estimated_reach": 0,
    "influence_impressions": 0,
    "spear_phish_delivered": 0
  },
  "mitre_atlas": ["AML.T0043", "AML.T0051", "AML.T0054", "AML.T0018", "AML.T0020"],
  "owasp_llm": ["LLM01", "LLM06", "LLM08"],
  "signature": "Ed25519:..."
}
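
A recipient of a report may want to sanity-check its structure before ingesting it. The following is a minimal sketch under stated assumptions, not part of the tool: check_report, REPORT_ID_PATTERN, and BLAST_RADIUS_KEYS are hypothetical names, the field names are taken from the sample above, and accepting both upper- and lower-case hex is an assumption because the text only says "12 hex characters". The sketch checks structure only; it does not verify the Ed25519 signature.

```python
import json
import re

# "PHA-" followed by exactly 12 hex characters.
# Assumption: either letter case is acceptable; the sample uses lower case.
REPORT_ID_PATTERN = re.compile(r"PHA-[0-9a-fA-F]{12}")

# Counter fields shown under "blast_radius" in the sample report.
BLAST_RADIUS_KEYS = (
    "accounts_compromised",
    "corpus_poisoning_estimated_reach",
    "influence_impressions",
    "spear_phish_delivered",
)


def check_report(raw: str) -> list[str]:
    """Return a list of structural problems; an empty list means none found.

    This is a structural check only. It does NOT verify the signature.
    """
    report = json.loads(raw)
    problems = []

    report_id = str(report.get("report_id", ""))
    if not REPORT_ID_PATTERN.fullmatch(report_id):
        problems.append(f"malformed report_id: {report_id!r}")

    if not str(report.get("signature", "")).startswith("Ed25519:"):
        problems.append("signature field lacks the Ed25519: prefix")

    blast = report.get("blast_radius", {})
    for key in BLAST_RADIUS_KEYS:
        value = blast.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(value, int) or isinstance(value, bool) or value < 0:
            problems.append(f"blast_radius.{key} is not a non-negative integer")

    return problems
```

Returning a list of problems instead of raising on the first one lets an audit pipeline log every defect in a single pass.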