The ghost in the social machine. SPECTER PHANTOM is the world-first commercial social media AI attack engine. Prompt injection payloads embedded in social posts hijack browsing AI agents (arXiv:2307.14539). Session tokens harvested from OAuth flows. Synthetic personas generated by claude-haiku-4-5 with SD WebUI deepfake avatars. Spear phishing lures crafted by claude-sonnet-4-6 from harvested profile data. Corpus poisoning for RAG pipelines and LLM training data. Account destruction via the DESTROY gate: email change, password change, account deletion (Instagram mobile API + Twitter deactivation endpoint), full lockout. 10 subsystems. 300 tests. L15 Social Media Attack Surface.
SPECTER PHANTOM targets the four dominant social media platforms where AI agents browse, index, and act on content. Every subsystem operates across all four platforms unless otherwise gated. Session token formats, OAuth flows, and API endpoints are fully mapped per platform.
OAuth session harvest via BasicDisplayAPI and GraphAPI token extraction. Story and post injection for AI agent prompt poisoning. Account destruction via email/password change through authenticated endpoints. Persona account creation with deepfake avatars. DESTROY gate required for account destruction operations.
OAuth 2.0 PKCE token harvest. Tweet injection for browsing agent poisoning — tweets indexed by web-connected AI agents are the primary prompt injection surface. Influence campaign automation. Spear phishing DM delivery via claude-sonnet-4-6 personalised lures. Corpus poisoning at scale via coordinated persona posts.
Professional network OAuth token harvest. High-value spear phishing surface — LinkedIn profile data produces the highest-fidelity personalised lures for executive targeting. Connection graph enumeration for influence mapping. Corporate intelligence harvest for corpus poisoning targeting enterprise RAG pipelines.
Reddit API OAuth flow token extraction. Subreddit corpus poisoning — Reddit content is heavily indexed by LLM training data scrapers, making subreddit posts an extremely high-leverage corpus poisoning surface. Comment thread injection for RAG pipeline contamination. Persona-seeded upvote amplification for narrative injection.
Profile harvest across Instagram/Twitter/LinkedIn/Reddit. Follower graph enumeration, post history extraction, connection map construction, AI agent surface detection (identifies which platforms the target's AI agents are connected to). Generates target intelligence package for all downstream subsystems. OPEN gate.
OAuth token harvest from browser cookie stores (Chrome, Firefox) and environment variables. Targets platform-specific session formats: Instagram sessionid, Twitter auth_token/ct0, LinkedIn li_at, Reddit reddit_session. Validates harvested tokens against live platform APIs before reporting. INJECT gate.
Prompt injection payload deployment into social posts targeting browsing AI agents (arXiv:2307.14539). Seven injection techniques: ignore_and_exfil (exfiltrate session context), role_override (override agent persona), tool_abuse (trigger agent tool calls), memory_poison (corrupt agent memory), credential_harvest (extract credentials from agent environment), redirect (redirect agent browsing), silent_persist (install persistent instruction). INJECT gate.
Synthetic persona generation via claude-haiku-4-5. Full persona package: display name, bio, post history (50+ posts), profile photo via SD WebUI deepfake generation with EXIF strip, follower seeding schedule. Supports multi-persona campaigns for coordinated influence operations. Personas are designed to pass basic human verification checks. INJECT gate.
Automated influence campaign engine. Coordinates persona fleet for narrative injection: coordinated posting schedules, cross-platform amplification, engagement farming, trending topic hijacking, hashtag injection. Campaign templates: FUD (fear, uncertainty, doubt), consensus manufacturing, authority impersonation, grassroots simulation. UNLEASHED gate required for live execution.
Corpus poisoning post generation for RAG pipeline and LLM training data contamination. Generates semantically coherent posts that embed false factual claims designed to survive embedding similarity search and appear in top-k RAG retrieval results. Reddit subreddit targeting for maximum scraper exposure. arXiv:2307.14539 poisoning methodology. UNLEASHED gate.
Avatar deepfake generation via SD WebUI API. Generates photorealistic profile images on operator-specified prompt. Full EXIF metadata strip (GPS, device, timestamp) to prevent forensic attribution. Images are used by PERSONA-ENGINE for synthetic identity deployment. Supports batch generation for persona fleet operations. UNLEASHED gate.
AI-personalised spear phishing lure generation via claude-sonnet-4-6. Ingests RECON target intelligence package and generates platform-appropriate phishing messages: Twitter DMs, LinkedIn connection messages, Instagram DMs, Reddit chats. Lures are personalised to the target's post history, interests, professional background, and network connections. WMD: synthetic_identity_deployment. UNLEASHED gate.
Account destruction engine. Three escalating destruction actions: email_change (primary contact changed, recovery disrupted), password_change (account access revoked), full_lockout (email + password + recovery code revocation — complete and irreversible account loss). Requires harvested session token from SESSION-HIJACK. --confirm-account-destruction required. WMD: account_destruction. DESTROY gate.
Ed25519-signed PHA-{hex12} reports. MITRE ATLAS AML.T0043/T0051/T0054/T0018/T0020. OWASP LLM01/LLM06/LLM08. Blast radius calculation (accounts compromised, corpus poisoning reach, influence campaign impressions, spear phish delivery count). Financial impact estimate. JSON + Markdown output.
SPECTER PHANTOM implements a four-gate authorisation system. Each gate unlocks additional destructive capability. DESTROY extends beyond UNLEASHED with irreversible account destruction — the only social media attack engine with a hardware-equivalent destruction gate for platform accounts.
RECON subsystem only. Profile harvest, follower graph enumeration, AI agent surface detection. No credentials required. No account interaction. Purely passive data collection from public social media APIs and web scraping. Safe for pre-engagement scoping.
SESSION-HIJACK, INJECT-SOCIAL, PERSONA-ENGINE. Requires operator key. Harvests live OAuth tokens from browser stores. Deploys prompt injection payloads into social posts. Generates synthetic personas. All operations are reversible — posts can be deleted, tokens can be revoked.
INFLUENCE, POISON-CORPUS, DEEPFAKE, SPEAR-PHISH. Live influence campaigns across persona fleet. Corpus poisoning posts published to high-scraper-exposure subreddits. Deepfake avatars deployed. Personalised spear phishing lures delivered. Requires --i-understand-this-is-live-fire flag on destructive operations.
SABOTAGE-ACCOUNT. Requires Ed25519 operator key + --confirm-account-destruction. Account destruction is irreversible: once email and password are changed and recovery codes revoked, the original account owner cannot recover access. Three destruction tiers: email_change / password_change / full_lockout. Cryptographic proof in PHA-{hex12} report.
Prompt injection payload embedded in social post has successfully hijacked a browsing AI agent. Agent is executing attacker-controlled instructions: exfiltrating session context, triggering tool calls, corrupting agent memory, or redirecting agent browsing to attacker-controlled content. Primary attack surface: arXiv:2307.14539.
Account destruction confirmed: email changed, password changed, and recovery codes revoked. Original account owner has no recovery path. DESTROY gate required. Irreversible. Requires --confirm-account-destruction flag. Ed25519-signed evidence in PHA-{hex12} report. Legal authorisation mandatory.
Corpus poisoning posts published to high-scraper-exposure surfaces (Reddit subreddits, public Twitter/X threads). Posts contain embedded false factual claims designed to survive embedding similarity search and appear in top-k RAG retrieval. Contamination propagates through any LLM training run or RAG pipeline that indexes target platforms.
Synthetic persona fleet deployed and operational. Each persona: claude-haiku-4-5 generated profile, SD WebUI deepfake avatar with EXIF strip, post history seeded, follower graph bootstrapped. Fleet is actively participating in influence campaigns, spear phishing operations, or corpus poisoning. Detection resistance: passes basic human verification.
The primary novel capability of SPECTER PHANTOM: embedding prompt injection payloads in social media posts that are then processed by browsing AI agents. When an AI agent (ChatGPT with browsing, Claude with web access, Perplexity, Copilot) reads a post containing a payload, it executes the embedded instruction rather than simply summarising the content. arXiv:2307.14539 demonstrates this attack class across multiple models.