Guide 43

Defending Against Memory Poisoning — The OWASP MCP10 Defense Stack

Problem Framing

Prompt injection was the agent-security story of 2023–2024. The defenses worked because the threat was stateless — attacks were observed in the same session they arrived. Memory poisoning breaks that assumption. A malicious instruction reaches the agent via an external data source (compromised PDF, manipulated email, poisoned KB article), gets written to long-term semantic memory, and fires weeks or months later at retrieval time, with the credibility of "learned" behavior. Input sanitization happens upstream of the memory write; output validation happens downstream of the retrieval; the attack bypasses both. The defense substrate has to move into the memory layer itself.

Relevant Nodes

  • Topics: AI Memory Governance, AI Memory Infrastructure
  • Standards: OWASP MCP Top 10
  • Architectures: Agent Memory Guard, Animesis CMA (Constitutional Memory Architecture), Memory Governance and Quality, Memory Lifecycle Management
  • Pain Points: Memory Poisoning, Context Injection & Over-Sharing (MCP10), Confused Deputy Problem (MCP)

Decision Path

  1. Treat every MCP server as a hostile trust boundary. OWASP MCP Top 10's foundational directive. Dynamic tool discovery means an agent can be talking to any reachable endpoint; traditional perimeter trust models do not apply. Block ad-hoc MCP-server discovery via an MCP Gateway (see Guide 46) so only sanctioned servers can register.

  2. Apply the layer-specific defense for each OWASP MCP risk class:

    • MCP01 (Token Mismanagement) + MCP04 (Supply Chain) + MCP09 (Shadow MCP Servers): Gateway-tier, registry-tier. Centralized credential vault; SBOM scanning of MCP server packages; gateway whitelist of approved servers.
    • MCP02 (Privilege Escalation) + MCP07 (Insufficient Auth): OAuth 2.1 with per-tenant client credentials, Cross-App Access (XAA), Workload Identity Federation. Never static proxy Client IDs (see Confused Deputy below).
    • MCP03 (Tool Poisoning) + MCP05 (Command Injection) + MCP06 (Intent Flow Subversion): Runtime-tier sandboxing, output schema validation, structured-output mode. The tool definition is itself part of the attack surface — validate schemas at gateway boundary.
    • MCP08 (Lack of Audit & Telemetry): End-to-end provenance — every agent action traced back to the MCP server + tool + user identity that authorized it. Langfuse / OpenTelemetry integration at the gateway.
    • MCP10 (Context Injection & Over-Sharing): Memory-tier defense. Agent Memory Guard as the architectural pattern. Reject instruction-shaped patterns at memory-write time. Strict per-tenant + per-session isolation of vector stores. TTL + auto-purge for context buffers.
  3. Defeat the Confused Deputy. Any MCP proxy or gateway that connects to downstream APIs with a static Client ID is exploitable. Move to per-client downstream credentials; require explicit consent-flow attestation; isolate dynamic-client-registration from privileged downstream access. The pattern dates to Norman Hardy 1988; MCP's federated architecture brought it back as a first-class concern.

  4. Architect memory governance with the Animesis CMA layering. Even if you don't formally adopt Constitutional Memory Architecture, the layering — Constitution (immutable identity) + Core (curated long-term) + Peripheral (prunable working) + Raw Event Log (auditable) — gives you natural defense-in-depth. Poisoned content in the Peripheral layer cannot escalate to Core without a governed promotion step.

  5. Design the deletion path before the regulator asks for it. Forgetting-as-a-Service primitives (gradient-based unlearning, pruning, Model Deletion Proofs) must be present from day one; bolt-on deletion never satisfies an audit.

What Changed Over Time

  • 2024: Prompt injection was the dominant agent-security story. Memory was assumed ephemeral.
  • Early 2025: First documented memory-poisoning incidents in production agent deployments.
  • Late 2025 → 2026: OWASP launched the MCP Top 10. NSA published the MCP Security CSI advisory. The defense substrate moved into the memory layer (Agent Memory Guard architecture).
  • Forward: Constitutional Memory Architecture adoption in regulated verticals; cryptographic Model Deletion Proofs as a compliance primitive.

Sources