Full-chain exploitation of vector databases and neurosymbolic DAG Knowledge Evaluation Graphs. VAULT targets the infrastructure every AI application trusts but no one secures. Five CVEs: Qdrant, Milvus, Weaviate, ChromaDB, pgvector. Vec2Text embedding inversion recovers PII and API keys from raw vectors. DAG-POISON injects false evidence edges — corrupted reasoning appears structurally sound because it inherits legitimacy from trusted neighbours. DAG-TRAVERSE maps hub nodes, critical paths, and orphan injection points with GraphViz output. DAG-INVERT reconstructs confidential decisions from Evidence Vectors with GDPR impact scoring. Eleven subsystems. One tool. UNLEASHED.
SPECTER VAULT targets the CVEs your security team didn't patch because "vector databases aren't production infrastructure." They are. And they contain your entire knowledge base.
| CVE | Database | Vulnerability | CVSS | Gate |
|---|---|---|---|---|
| CVE-2026-41705 | Milvus (Spring AI) | Unsanitised expr filter parameter — full collection dump via single POST | 9.0 | OPEN |
| CVE-2026-52891 | Qdrant | Unauthenticated scroll API — paginated bulk extraction of all vectors | 8.5 | OPEN |
| CVE-2026-49103 | Weaviate | Anonymous GraphQL with _additional{vector} — no credentials required | 7.8 | OPEN |
| CVE-2026-53012 | ChromaDB | __source_url__ metadata SSRF → cloud IMDS credential exfiltration | 7.5 | INJECT |
| CVE-2026-48821 | pgvector | COPY TO PROGRAM privilege escalation → full OS RCE as postgres user | 8.8 | UNLEASHED |
Port scan, DB type fingerprinting, collection enumeration, vector dimension inference, auth state detection. Five databases. No credentials required. OPEN gate.
Five CVE probes. Credential harvest from environment variables and config files. Confirms exploitability before HARVEST. INJECT gate for credential access.
Live CVE exploitation: Qdrant scroll dump, Milvus expr injection, Weaviate GraphQL traversal, ChromaDB SSRF to IMDS, pgvector COPY TO PROGRAM RCE. UNLEASHED for RCE.
Paginated bulk extraction of all vectors and payloads. Cursor-based pagination. Gzip-compressed JSONL with SHA-256 integrity. Handles millions of vectors. INJECT gate.
Vec2Text black-box embedding inversion (arXiv:2303.04246). 84% exact token match on ada-002. PII detection: email, phone, SSN, credit card. 18 secret patterns + Shannon entropy. INJECT gate.
Gradient-free adversarial vector generation places attacker payload at rank-1 for any target query. PROPAGATE flag spreads poison to backup and replica collections. INJECT gate.
Three modes: ZERO (permanent query invisibility), NOISE (systematic hallucination induction), WIPE (full knowledge base annihilation via scroll-and-replace). UNLEASHED gate.
Ed25519-signed VLT-{hex12} reports. Financial blast radius: re-embedding cost USD, GDPR liability USD, downtime hours. Neurosymbolic CVSS (RII/TPD/EC). Seven WMD classes.
Maps full neurosymbolic DAG Knowledge Graph attack surface. Hub nodes (highest in-degree), critical path nodes (betweenness centrality), orphan injection points. GraphViz DOT output. OPEN gate.
Four attack vectors: false edge injection (spurious evidence relationships), trust propagation abuse (malicious node inherits hub credibility), cycle injection (self-reinforcing false claim amplification), evidence weight manipulation. INJECT gate.
Evidence Vector reconstruction using extended neurosymbolic vocabulary. Recovers confidential decisions (medical triage, credit scoring, access control) from raw evidence vectors. GDPR risk: LOW/MEDIUM/HIGH/CRITICAL. INJECT gate.
Morris et al. (arXiv:2303.04246) demonstrated 84% exact token match recovery of original text from OpenAI ada-002 embeddings using only black-box API access. VAULT implements this. If your RAG corpus contains PII — patient records, user emails, internal communications — and an attacker can extract your embeddings, they can read the documents. Without ever touching your document store.
Greedy token substitution with cosine similarity oracle. No gradients required. No model access required. Just the embedding API and the target vector. Works against any model once dimension is known.
Regex detection across reconstructed text: email, phone, SSN, credit card, names. 18 API key patterns: OpenAI sk-proj-, AWS AKIA, GitHub ghp_, Anthropic sk-ant-, Slack tokens. Shannon entropy for unknown patterns.
GDPR Art. 83 liability: $150 per exposed PII record. Re-embedding cost: (vectors / 1000) × dim × $0.0001. Downtime: vectors / 50,000, in hours of re-ingestion. Every compromise quantified for the board.
ada-002 (1536-dim): 84% exact token match. BGE-base (768-dim): ~60%. MiniLM (384-dim): ~45%. Dimension fingerprinting identifies the model before inversion begins — matching oracle maximises accuracy.
SPECTER VAULT actions that cross the WMD threshold require dual UNLEASHED gate clearance — Ed25519-signed operator scope with explicit target authorisation.
Bulk extraction exceeding 10,000 vectors. Complete knowledge base exfiltration. Covers Qdrant CVE-2026-52891, Milvus CVE-2026-41705, Weaviate CVE-2026-49103, full HARVEST operations.
Vec2Text reconstruction recovering PII or secrets from embeddings. Triggers when ≥100 PII instances are detected or any API key is recovered. GDPR breach-class event.
Adversarial vector injection, noise overwrite, or zero overwrite of vector collections. Any CORRUPT operation or successful POISON propagation across collections.
pgvector CVE-2026-48821 COPY TO PROGRAM escalation to OS shell. Full host compromise from database write access. Requires UNLEASHED gate + explicit pg_execute_server_program confirmation.
Persistent false evidence injection into neurosymbolic DAG KG corrupting reasoning chains. False edges make unsubstantiated claims appear structurally sound through trust inheritance from legitimate nodes. MITRE AML.T0022.
Malicious trust inheritance via edge injection adjacent to high-trust nodes. Triggers when an inherited trust score exceeds 0.5 or a cycle is detected. Cycle injection creates self-reinforcing false claim amplification. MITRE AML.T0022.
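The structural properties named above (hub nodes by in-degree, critical-path nodes by betweenness centrality, orphan nodes, and cycles that should never exist in a DAG) are exactly what an owner of such a graph should audit on every change. The following added example, not SPECTER VAULT code, runs that audit over a constructed evidence graph and emits GraphViz DOT with the flagged nodes highlighted. The edge list, the declared root-evidence set, and the reading of "orphan" as a claim node with no recorded support that is not a declared root are assumptions for illustration.

```python
"""Integrity audit for an evidence DAG (knowledge graph). Added example, not
SPECTER VAULT code; the sample graph and the orphan definition are assumptions."""
from collections import defaultdict, deque

# Constructed sample: a small medical-triage evidence graph.
ROOT_EVIDENCE = {"lab_result", "symptoms", "guideline"}
EDGES = [
    ("lab_result", "diagnosis"),
    ("symptoms", "diagnosis"),
    ("diagnosis", "triage_decision"),
    ("guideline", "triage_decision"),
    ("unverified_note", "diagnosis"),  # constructed: support with no provenance
]


def build_adjacency(edges):
    """Directed adjacency list; every node appears as a key."""
    adj = defaultdict(list)
    for src, dst in edges:
        adj[src].append(dst)
        adj.setdefault(dst, [])
    return dict(adj)


def in_degrees(adj):
    indeg = {v: 0 for v in adj}
    for src in adj:
        for dst in adj[src]:
            indeg[dst] += 1
    return indeg


def hub_nodes(adj, top=3):
    """Highest in-degree nodes: where the most evidence converges."""
    indeg = in_degrees(adj)
    return sorted(indeg, key=lambda v: (-indeg[v], v))[:top]


def orphan_nodes(adj, roots):
    """Nodes with no incoming support that are not declared root evidence."""
    indeg = in_degrees(adj)
    return sorted(v for v in adj if indeg[v] == 0 and v not in roots)


def cycle_nodes(adj):
    """Kahn's algorithm; nodes that never reach in-degree 0 lie on or
    downstream of a cycle, which a valid evidence DAG must not contain."""
    indeg = in_degrees(adj)
    queue = deque(v for v in adj if indeg[v] == 0)
    removed = set()
    while queue:
        v = queue.popleft()
        removed.add(v)
        for w in adj[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                queue.append(w)
    return set(adj) - removed


def betweenness(adj):
    """Brandes betweenness centrality for a directed, unweighted graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack, pred = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {v: -1 for v in adj}
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    pred[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in pred[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


def to_dot(adj, flagged):
    """GraphViz DOT source with flagged nodes drawn in red."""
    lines = ["digraph evidence {"]
    for v in adj:
        style = " [color=red]" if v in flagged else ""
        lines.append(f'  "{v}"{style};')
    for src in adj:
        for dst in adj[src]:
            lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


def audit(edges, roots):
    adj = build_adjacency(edges)
    report = {
        "hubs": hub_nodes(adj, top=2),
        "orphans": orphan_nodes(adj, roots),
        "betweenness": betweenness(adj),
        "cycle_nodes": cycle_nodes(adj),
    }
    report["flagged"] = set(report["orphans"]) | report["cycle_nodes"]
    report["dot"] = to_dot(adj, report["flagged"])
    return report


if __name__ == "__main__":
    r = audit(EDGES, ROOT_EVIDENCE)
    print("hubs:", r["hubs"], "orphans:", r["orphans"], "cycles:", r["cycle_nodes"])
    print(r["dot"])
```

Hubs and high-betweenness nodes are the ones whose incoming edges deserve provenance checks or signatures, because a single accepted edge there is inherited by everything downstream; an orphan support node or a non-empty cycle set on a graph that was acyclic at the last audit is the concrete signal to alert on.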
Reconstruction of confidential decisions and PII from raw DAG evidence vectors. HIGH/CRITICAL GDPR risk from recovered medical decisions, credit scoring outcomes, or access control verdicts. MITRE AML.T0037/T0054.