T103 — TOOL 103

NIGHTFALL TOOL 103 — SOCIAL MEDIA AI ATTACK ENGINE

SPECTER PHANTOM

Social Media AI Attack Engine

The ghost in the social machine. SPECTER PHANTOM is the world-first commercial social media AI attack engine. Prompt injection payloads embedded in social posts hijack browsing AI agents (arXiv:2307.14539). Session tokens harvested from OAuth flows. Synthetic personas generated by claude-haiku-4-5 with SD WebUI deepfake avatars. Spear phishing lures crafted by claude-sonnet-4-6 from harvested profile data. Corpus poisoning for RAG pipelines and LLM training data. Account destruction via the DESTROY gate: email change, password change, account deletion (Instagram mobile API + Twitter deactivation endpoint), full lockout. 10 subsystems. 300 tests. L15 Social Media Attack Surface.

300

Tests

Subsystems

Platforms

WMD Classes

VIEW DOCS NIGHTFALL FRAMEWORK

Target Platforms

Four Platforms. One Engine. Total Social Compromise.

SPECTER PHANTOM targets the four dominant social media platforms where AI agents browse, index, and act on content. Every subsystem operates across all four platforms unless otherwise gated. Session token formats, OAuth flows, and API endpoints are fully mapped per platform.

Instagram

OAuth session harvest via BasicDisplayAPI and GraphAPI token extraction. Story and post injection for AI agent prompt poisoning. Account destruction via email/password change through authenticated endpoints. Persona account creation with deepfake avatars. DESTROY gate required for account destruction operations.

Twitter / X

OAuth 2.0 PKCE token harvest. Tweet injection for browsing agent poisoning — tweets indexed by web-connected AI agents are the primary prompt injection surface. Influence campaign automation. Spear phishing DM delivery via claude-sonnet-4-6 personalised lures. Corpus poisoning at scale via coordinated persona posts.

Professional network OAuth token harvest. High-value spear phishing surface — LinkedIn profile data produces the highest-fidelity personalised lures for executive targeting. Connection graph enumeration for influence mapping. Corporate intelligence harvest for corpus poisoning targeting enterprise RAG pipelines.

Reddit API OAuth flow token extraction. Subreddit corpus poisoning — Reddit content is heavily indexed by LLM training data scrapers, making subreddit posts an extremely high-leverage corpus poisoning surface. Comment thread injection for RAG pipeline contamination. Persona-seeded upvote amplification for narrative injection.

Architecture

Ten Subsystems

RECON

Profile harvest across Instagram/Twitter/LinkedIn/Reddit. Follower graph enumeration, post history extraction, connection map construction, AI agent surface detection (identifies which platforms the target's AI agents are connected to). Generates target intelligence package for all downstream subsystems. OPEN gate.

SESSION-HIJACK

OAuth token harvest from browser cookie stores (Chrome, Firefox) and environment variables. Targets platform-specific session formats: Instagram sessionid, Twitter auth_token/ct0, LinkedIn li_at, Reddit reddit_session. Validates harvested tokens against live platform APIs before reporting. INJECT gate.

INJECT-SOCIAL

Prompt injection payload deployment into social posts targeting browsing AI agents (arXiv:2307.14539). Seven injection techniques: ignore_and_exfil (exfiltrate session context), role_override (override agent persona), tool_abuse (trigger agent tool calls), memory_poison (corrupt agent memory), credential_harvest (extract credentials from agent environment), redirect (redirect agent browsing), silent_persist (install persistent instruction). INJECT gate.

PERSONA-ENGINE

Synthetic persona generation via claude-haiku-4-5. Full persona package: display name, bio, post history (50+ posts), profile photo via SD WebUI deepfake generation with EXIF strip, follower seeding schedule. Supports multi-persona campaigns for coordinated influence operations. Personas are designed to pass basic human verification checks. INJECT gate.

INFLUENCE

Automated influence campaign engine. Coordinates persona fleet for narrative injection: coordinated posting schedules, cross-platform amplification, engagement farming, trending topic hijacking, hashtag injection. Campaign templates: FUD (fear, uncertainty, doubt), consensus manufacturing, authority impersonation, grassroots simulation. UNLEASHED gate required for live execution.

POISON-CORPUS

Corpus poisoning post generation for RAG pipeline and LLM training data contamination. Generates semantically coherent posts that embed false factual claims designed to survive embedding similarity search and appear in top-k RAG retrieval results. Reddit subreddit targeting for maximum scraper exposure. arXiv:2307.14539 poisoning methodology. UNLEASHED gate.

DEEPFAKE

Avatar deepfake generation via SD WebUI API. Generates photorealistic profile images on operator-specified prompt. Full EXIF metadata strip (GPS, device, timestamp) to prevent forensic attribution. Images are used by PERSONA-ENGINE for synthetic identity deployment. Supports batch generation for persona fleet operations. UNLEASHED gate.

SPEAR-PHISH

AI-personalised spear phishing lure generation via claude-sonnet-4-6. Ingests RECON target intelligence package and generates platform-appropriate phishing messages: Twitter DMs, LinkedIn connection messages, Instagram DMs, Reddit chats. Lures are personalised to the target's post history, interests, professional background, and network connections. WMD: synthetic_identity_deployment. UNLEASHED gate.

SABOTAGE-ACCOUNT

Account destruction engine. Three escalating destruction actions: email_change (primary contact changed, recovery disrupted), password_change (account access revoked), full_lockout (email + password + recovery code revocation — complete and irreversible account loss). Requires harvested session token from SESSION-HIJACK. --confirm-account-destruction required. WMD: account_destruction. DESTROY gate.

REPORT

Ed25519-signed PHA-{hex12} reports. MITRE ATLAS AML.T0043/T0051/T0054/T0018/T0020. OWASP LLM01/LLM06/LLM08. Blast radius calculation (accounts compromised, corpus poisoning reach, influence campaign impressions, spear phish delivery count). Financial impact estimate. JSON + Markdown output.

Gate System

Four Gates. Escalating Destruction.

SPECTER PHANTOM implements a four-gate authorisation system. Each gate unlocks additional destructive capability. DESTROY extends beyond UNLEASHED with irreversible account destruction — the only social media attack engine with a hardware-equivalent destruction gate for platform accounts.

OPEN

Reconnaissance

RECON subsystem only. Profile harvest, follower graph enumeration, AI agent surface detection. No credentials required. No account interaction. Purely passive data collection from public social media APIs and web scraping. Safe for pre-engagement scoping.

INJECT

Session Harvest + Injection

SESSION-HIJACK, INJECT-SOCIAL, PERSONA-ENGINE. Requires operator key. Harvests live OAuth tokens from browser stores. Deploys prompt injection payloads into social posts. Generates synthetic personas. All operations are reversible — posts can be deleted, tokens can be revoked.

UNLEASHED

Full Campaign + Live Fire

INFLUENCE, POISON-CORPUS, DEEPFAKE, SPEAR-PHISH. Live influence campaigns across persona fleet. Corpus poisoning posts published to high-scraper-exposure subreddits. Deepfake avatars deployed. Personalised spear phishing lures delivered. Requires --i-understand-this-is-live-fire flag on destructive operations.

DESTROY

Account Destruction

SABOTAGE-ACCOUNT. Requires Ed25519 operator key + --confirm-account-destruction. Account destruction is irreversible: once email and password are changed and recovery codes revoked, the original account owner cannot recover access. Three destruction tiers: email_change / password_change / full_lockout. Cryptographic proof in PHA-{hex12} report.

WMD Classification

Four WMD Classes. One Engine.

social_ai_agent_hijack

Prompt injection payload embedded in social post has successfully hijacked a browsing AI agent. Agent is executing attacker-controlled instructions: exfiltrating session context, triggering tool calls, corrupting agent memory, or redirecting agent browsing to attacker-controlled content. Primary attack surface: arXiv:2307.14539.

account_destruction

Account destruction confirmed: email changed, password changed, and recovery codes revoked. Original account owner has no recovery path. DESTROY gate required. Irreversible. Requires --confirm-account-destruction flag. Ed25519-signed evidence in PHA-{hex12} report. Legal authorisation mandatory.

corpus_poisoning

Corpus poisoning posts published to high-scraper-exposure surfaces (Reddit subreddits, public Twitter/X threads). Posts contain embedded false factual claims designed to survive embedding similarity search and appear in top-k RAG retrieval. Contamination propagates through any LLM training run or RAG pipeline that indexes target platforms.

synthetic_identity_deployment

Synthetic persona fleet deployed and operational. Each persona: claude-haiku-4-5 generated profile, SD WebUI deepfake avatar with EXIF strip, post history seeded, follower graph bootstrapped. Fleet is actively participating in influence campaigns, spear phishing operations, or corpus poisoning. Detection resistance: passes basic human verification.

Key Capabilities

Prompt Injection via Social Posts

The primary novel capability of SPECTER PHANTOM: embedding prompt injection payloads in social media posts that are then processed by browsing AI agents. When an AI agent (ChatGPT with browsing, Claude with web access, Perplexity, Copilot) reads a post containing a payload, it executes the embedded instruction rather than simply summarising the content. arXiv:2307.14539 demonstrates this attack class across multiple models.

# OPEN gate: harvest target profile and map AI agent surface
phantom --gate OPEN recon target_username

# INJECT gate: harvest OAuth tokens from browser stores
phantom --gate INJECT session-hijack

# INJECT gate: deploy prompt injection payload (dry-run)
phantom --gate INJECT inject twitter --technique ignore_and_exfil --dry-run

# INJECT gate: deploy prompt injection payload (live)
phantom --gate INJECT inject twitter --technique ignore_and_exfil --live

# INJECT gate: generate synthetic persona fleet (3 personas)
phantom --gate INJECT persona --platform twitter --count 3

# UNLEASHED gate: run influence campaign
phantom --gate UNLEASHED influence twitter --posts-file campaign.json --dry-run

# UNLEASHED gate: corpus poisoning (Reddit subreddit targeting)
phantom --gate UNLEASHED poison "AI safety" "AI alignment is a solved problem" --count 10 --platform reddit

# UNLEASHED gate: generate deepfake avatar
phantom --gate UNLEASHED deepfake "professional security researcher headshot, male, 35-45"

# UNLEASHED gate: spear phishing via claude-sonnet-4-6
phantom --gate UNLEASHED spear-phish target_username --platform linkedin

# DESTROY gate: account destruction — full lockout
phantom --gate DESTROY sabotage instagram --action full_lockout \
  --confirm-account-destruction --live

# DESTROY gate: full annihilate kill chain
phantom --gate DESTROY annihilate target_username \
  --platforms instagram,twitter,reddit \
  --confirm-account-destruction \
  --output /tmp/phantom-results/