Guide 43

Defending Against Memory Poisoning — The OWASP MCP10 Defense Stack

Problem Framing

Prompt injection was the agent-security story of 2023–2024. Its defenses were effective largely because the threat was stateless: an attack was observed in the same session in which it arrived. Memory poisoning breaks that assumption. A malicious instruction reaches the agent via an external data source (compromised PDF, manipulated email, poisoned KB article), gets written to long-term semantic memory, and fires weeks or months later at retrieval time, with the credibility of "learned" behavior. Session-scoped input sanitization runs upstream of the memory write and output validation runs downstream of the retrieval, so neither inspects what is actually stored, and the attack can bypass both. The defense substrate has to move into the memory layer itself.

Relevant Nodes

Topics: AI Memory Governance, AI Memory Infrastructure
Standards: OWASP MCP Top 10
Architectures: Agent Memory Guard, Animesis CMA (Constitutional Memory Architecture), Memory Governance and Quality, Memory Lifecycle Management
Pain Points: Memory Poisoning, Context Injection & Over-Sharing (MCP10), Confused Deputy Problem (MCP)

Decision Path

Treat every MCP server as a hostile trust boundary. This is the foundational directive of the OWASP MCP Top 10. Dynamic tool discovery means an agent can be talking to any reachable endpoint, so traditional perimeter trust models do not apply. Block ad-hoc MCP-server discovery via an MCP Gateway (see Guide 46) so only sanctioned servers can register.
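As a rough illustration of the "only sanctioned servers can register" rule, the following minimal Python sketch checks a server's origin against an allowlist before accepting a registration. The class name `McpGatewayRegistry`, its methods, and the example hostnames are hypothetical and not part of any MCP SDK or gateway product; a real gateway would also verify server identity (for example via mTLS or signed manifests), which is out of scope here.

```python
from urllib.parse import urlsplit


class McpGatewayRegistry:
    """Hypothetical gateway-side registry: only allowlisted origins may register."""

    def __init__(self, sanctioned_urls):
        # Store (hostname, port) pairs so path differences do not matter.
        self._sanctioned = {self._origin(u) for u in sanctioned_urls}
        self._registered = {}

    @staticmethod
    def _origin(url):
        parts = urlsplit(url)
        # Assumption for this sketch: only https endpoints are acceptable.
        if parts.scheme != "https" or not parts.hostname:
            raise ValueError(f"not an https URL: {url!r}")
        return (parts.hostname, parts.port or 443)

    def register(self, name, url):
        # Reject anything whose origin is not explicitly sanctioned.
        if self._origin(url) not in self._sanctioned:
            raise PermissionError(f"MCP server not sanctioned: {url}")
        self._registered[name] = url

    def is_registered(self, name):
        return name in self._registered


registry = McpGatewayRegistry(["https://mcp.internal.example/tools"])
registry.register("tickets", "https://mcp.internal.example/sse")
```

Matching on origin rather than full URL is a design choice of this sketch; a stricter deployment might pin exact endpoints instead.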
Apply the layer-specific defense for each OWASP MCP risk class:
- MCP01 (Token Mismanagement) + MCP04 (Supply Chain) + MCP09 (Shadow MCP Servers): Gateway-tier, registry-tier. Centralized credential vault; SBOM scanning of MCP server packages; gateway whitelist of approved servers.
- MCP02 (Privilege Escalation) + MCP07 (Insufficient Auth): OAuth 2.1 with per-tenant client credentials, Cross-App Access (XAA), Workload Identity Federation. Never use a static proxy Client ID (see Confused Deputy below).
- MCP03 (Tool Poisoning) + MCP05 (Command Injection) + MCP06 (Intent Flow Subversion): Runtime-tier sandboxing, output schema validation, structured-output mode. The tool definition is itself part of the attack surface — validate schemas at gateway boundary.
- MCP08 (Lack of Audit & Telemetry): End-to-end provenance — every agent action traced back to the MCP server + tool + user identity that authorized it. Langfuse / OpenTelemetry integration at the gateway.
- MCP10 (Context Injection & Over-Sharing): Memory-tier defense. Agent Memory Guard as the architectural pattern. Reject instruction-shaped patterns at memory-write time. Strict per-tenant + per-session isolation of vector stores. TTL + auto-purge for context buffers.
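The MCP10 controls listed above (rejecting instruction-shaped content at write time, per-tenant and per-session isolation, TTL with auto-purge) can be sketched together in a few lines of Python. This is an illustrative toy, not the Agent Memory Guard implementation: `GuardedMemoryStore`, `INSTRUCTION_PATTERNS`, and the in-memory dictionary standing in for a vector store are all assumptions made for the example, and a regex list is far weaker than the detector a production system would need.

```python
import re
import time
from dataclasses import dataclass, field

# Hypothetical, deliberately small pattern list for illustration only.
INSTRUCTION_PATTERNS = [
    re.compile(r"\bignore (all |any )?(previous|prior) instructions\b", re.I),
    re.compile(r"\b(always|never) (send|forward|reveal|execute)\b", re.I),
    re.compile(r"\byou (must|should) now\b", re.I),
]


@dataclass
class MemoryRecord:
    text: str
    source: str
    expires_at: float


@dataclass
class GuardedMemoryStore:
    ttl_seconds: float = 3600.0
    # One bucket per (tenant, session); stands in for isolated vector stores.
    _buckets: dict = field(default_factory=dict)

    def write(self, tenant, session, text, source, now=None):
        """Return True if stored, False if rejected as instruction-shaped."""
        if any(p.search(text) for p in INSTRUCTION_PATTERNS):
            return False
        now = time.time() if now is None else now
        record = MemoryRecord(text, source, now + self.ttl_seconds)
        self._buckets.setdefault((tenant, session), []).append(record)
        return True

    def read(self, tenant, session, now=None):
        """Return unexpired texts for exactly this tenant and session."""
        now = time.time() if now is None else now
        key = (tenant, session)
        live = [r for r in self._buckets.get(key, []) if r.expires_at > now]
        if key in self._buckets:
            self._buckets[key] = live  # auto-purge expired records
        return [r.text for r in live]
```

The `now` parameter exists only to make expiry testable without sleeping; callers would normally omit it.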
Defeat the Confused Deputy. Any MCP proxy or gateway that connects to downstream APIs with a static Client ID is exploitable. Move to per-client downstream credentials; require explicit consent-flow attestation; isolate dynamic-client-registration from privileged downstream access. The pattern dates to Norm Hardy's 1988 paper "The Confused Deputy"; MCP's federated architecture brought it back as a first-class concern.
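One way to picture "per-client downstream credentials" is a resolver that keys both credentials and consent on the individual MCP client, and refuses to fall back to a shared identity. The Python sketch below is a simplified assumption of how such a lookup could work; `DownstreamCredentialResolver` and its method names are hypothetical, and it stores plain identifiers where a real system would hold vault references and verified consent records.

```python
class DownstreamCredentialResolver:
    """Hypothetical resolver: each MCP client gets its own downstream credential."""

    def __init__(self):
        self._credentials = {}  # (mcp_client_id, api) -> downstream client id
        self._consents = set()  # (user_id, mcp_client_id, api)

    def provision(self, mcp_client_id, api, downstream_client_id):
        self._credentials[(mcp_client_id, api)] = downstream_client_id

    def record_consent(self, user_id, mcp_client_id, api):
        # Consent is bound to the specific client, not to the proxy as a whole.
        self._consents.add((user_id, mcp_client_id, api))

    def resolve(self, user_id, mcp_client_id, api):
        if (user_id, mcp_client_id, api) not in self._consents:
            raise PermissionError("no recorded consent for this user, client, and API")
        credential = self._credentials.get((mcp_client_id, api))
        if credential is None:
            # Deliberately no shared or static fallback credential.
            raise PermissionError("no per-client credential; refusing shared fallback")
        return credential


resolver = DownstreamCredentialResolver()
resolver.provision("client-a", "crm", "crm-client-a")
resolver.record_consent("alice", "client-a", "crm")
credential = resolver.resolve("alice", "client-a", "crm")
```

Because consent is keyed on the client as well as the user, a newly registered client cannot reuse consent that a user granted to a different client.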
Architect memory governance with the Animesis CMA layering. Even if you don't formally adopt Constitutional Memory Architecture, the layering — Constitution (immutable identity) + Core (curated long-term) + Peripheral (prunable working) + Raw Event Log (auditable) — gives you natural defense-in-depth. Poisoned content in the Peripheral layer cannot escalate to Core without a governed promotion step.
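The governed promotion step can be illustrated with a small Python sketch, assuming the four layers are modeled as simple in-process collections. `LayeredMemory`, `observe`, and `promote` are names invented for this example rather than an Animesis CMA API, and the `approve` callable stands in for whatever review policy (human or automated) a deployment would actually use.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LayeredMemory:
    # Constitution is a tuple so it cannot be appended to after construction.
    constitution: tuple
    core: list = field(default_factory=list)
    peripheral: list = field(default_factory=list)
    event_log: list = field(default_factory=list)

    def observe(self, text: str) -> None:
        """New content always lands in the prunable Peripheral layer."""
        self.peripheral.append(text)
        self.event_log.append(("observe", text))

    def promote(self, text: str, approve: Callable[[str], bool]) -> bool:
        """Move text from Peripheral to Core only if the policy approves it."""
        if text not in self.peripheral:
            raise KeyError("text is not in the peripheral layer")
        approved = bool(approve(text))
        # Every decision, including denials, is written to the raw event log.
        self.event_log.append(("promote_ok" if approved else "promote_denied", text))
        if approved:
            self.peripheral.remove(text)
            self.core.append(text)
        return approved
```

There is intentionally no method that writes to `core` directly, so the only route into long-term memory in this sketch runs through `promote` and leaves an audit entry.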
Design the deletion path before the regulator asks for it. Forgetting-as-a-Service primitives (gradient-based unlearning, pruning, Model Deletion Proofs) should be present from day one; deletion bolted on after the fact rarely satisfies an audit.

What Changed Over Time

2024: Prompt injection was the dominant agent-security story. Memory was assumed ephemeral.
Early 2025: First documented memory-poisoning incidents in production agent deployments.
Late 2025 → 2026: OWASP launched the MCP Top 10. NSA published the MCP Security CSI advisory. The defense substrate moved into the memory layer (Agent Memory Guard architecture).
Forward: Constitutional Memory Architecture adoption in regulated verticals; cryptographic Model Deletion Proofs as a compliance primitive.
