Inner/Outer Harness Pattern
Definition
A design pattern for autonomous agents that explicitly separates the **inner harness** (the loop concerned with model behavior: prompt formatting, tool schema, response parsing, model choice, retry logic for individual LLM calls) from the **outer harness** (the loop concerned with infrastructure: durable execution, checkpoint persistence, failure recovery, suspension/resumption, observability across runs). Each harness has independent technology choices, independent change cadence, and independent operational concerns; the contract between them is a small set of step-boundary primitives.
Most early agent frameworks mixed both concerns: LangChain before 2024 was a single layer handling prompt assembly, retry policy, state management, and infrastructure binding all at once. As a result, swapping the model meant changing the runtime, and swapping the runtime meant rewriting the agent. The harness split makes each axis independently changeable: you can swap Claude for GPT-5 without touching durable-execution code, and you can move from a single-machine runtime to Kitaru without rewriting the agent's tool-calling logic.
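The separation described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the API of Kitaru, LangChain, or any other framework: `InnerHarness`, `EchoModelHarness`, `OuterHarness`, and `run_step` are hypothetical names, the "model" is an ordinary callable, and an in-memory dict stands in for durable checkpoint storage.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol


class InnerHarness(Protocol):
    """Hypothetical contract: one step-boundary primitive, state in, state out."""

    def run_step(self, state: dict) -> dict: ...


@dataclass
class EchoModelHarness:
    """Inner harness (assumed): owns prompt formatting and per-call retries."""

    model: Callable[[str], str]  # stand-in for a real LLM client
    max_retries: int = 2

    def run_step(self, state: dict) -> dict:
        prompt = f"step {state['step']}: {state['task']}"
        last_error = None
        for _ in range(self.max_retries + 1):
            try:
                reply = self.model(prompt)
                break
            except RuntimeError as exc:  # retry only this individual call
                last_error = exc
        else:
            raise RuntimeError("model call failed") from last_error
        return {
            **state,
            "step": state["step"] + 1,
            "history": state["history"] + [reply],
        }


@dataclass
class OuterHarness:
    """Outer harness (assumed): owns checkpoints and resumption, nothing model-specific."""

    checkpoints: dict = field(default_factory=dict)  # stand-in for durable storage

    def run(self, run_id: str, inner: InnerHarness, task: str, max_steps: int) -> dict:
        # Resume from the last checkpoint if one exists, otherwise start fresh.
        state = self.checkpoints.get(run_id, {"task": task, "step": 0, "history": []})
        while state["step"] < max_steps:
            state = inner.run_step(state)
            self.checkpoints[run_id] = state  # persist at every step boundary
        return state


outer = OuterHarness()
first = outer.run("run-a", EchoModelHarness(lambda p: "A:" + p), "summarize", 2)
# Swap the "model" without touching OuterHarness; the run resumes at step 2.
resumed = outer.run("run-a", EchoModelHarness(lambda p: p.upper()), "summarize", 3)
```

The only thing the two classes share is the `run_step` signature and the shape of the state dict, which is what keeps the model axis and the runtime axis independently replaceable in this sketch.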
Recent developments
- The split is now the consensus framing across agent ecosystems. Pydantic AI, LangGraph, AutoGen, and crewAI all moved to an inner/outer-harness layering in their 2025-2026 releases. Per Pydantic, "Runtime layer for Pydantic AI agents".
- Kitaru is canonically an outer-harness product. It does not dictate which agent SDK you use; it provides only the durable-execution and step-boundary primitives, and the inner-harness choice is yours. Per GitHub, zenml-io/kitaru.
- FAME formalizes the layer separation in a serverless context. Inner-harness logic lives in stateless functions; outer-harness state lives in DynamoDB and S3. Per arXiv 2601.14735, FAME.
- Conflating the two layers is now classified as an anti-pattern. Reference architectures explicitly call out the "single layer doing both" mistake as the cause of most agent-system rewrites. Per Apple Podcasts, "Agents are Just While Loops" (MLOps.community).
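The stateless-function-plus-external-state layering mentioned in the list above can be illustrated with a small, self-contained sketch. It is not FAME's or Kitaru's implementation: `inner_step`, `FileStateStore`, and `drive` are hypothetical names, and a JSON file in a temporary directory stands in for a durable store such as a database or object storage. The point is only that a worker can crash between step boundaries and a fresh worker can pick up from the last saved state.

```python
import json
import tempfile
from pathlib import Path


def inner_step(state: dict) -> dict:
    """Stateless inner-harness function: output depends only on the input state."""
    n = state["step"]
    return {"step": n + 1, "outputs": state["outputs"] + [f"result-{n}"]}


class FileStateStore:
    """Hypothetical durable store; one JSON file per run stands in for a database."""

    def __init__(self, root: Path):
        self.root = root

    def load(self, run_id: str) -> dict:
        path = self.root / f"{run_id}.json"
        if path.exists():
            return json.loads(path.read_text())
        return {"step": 0, "outputs": []}  # no checkpoint yet: start fresh

    def save(self, run_id: str, state: dict) -> None:
        (self.root / f"{run_id}.json").write_text(json.dumps(state))


def drive(store: FileStateStore, run_id: str, total_steps: int, crash_at=None) -> dict:
    """Outer-harness loop: load, run one step, persist, repeat."""
    state = store.load(run_id)
    while state["step"] < total_steps:
        if crash_at is not None and state["step"] == crash_at:
            raise RuntimeError("simulated worker crash")
        state = inner_step(state)
        store.save(run_id, state)  # checkpoint at the step boundary
    return state


root = Path(tempfile.mkdtemp())
try:
    drive(FileStateStore(root), "r1", total_steps=4, crash_at=2)
    crashed = False
except RuntimeError:
    crashed = True

# A new store object (think: a new worker process) resumes from the saved step.
final = drive(FileStateStore(root), "r1", total_steps=4)
```

Because `inner_step` holds no state of its own, nothing is lost when the worker dies; only the outer loop and its store need to know how recovery works.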