Definition

What it is

Kitaru is an open-source **durable runtime** for AI agents from ZenML, designed as the "outer harness" that sits underneath any agent SDK (Pydantic AI, LangGraph, LlamaIndex, custom) and provides transparent checkpointing, replayable state, asynchronous suspension, and S3-backed artifact persistence. It explicitly separates the agent's *inner* concerns (prompt shape, tool selection, model choice) from its *outer* concerns (failure recovery, resumability, infrastructure binding), so an agent can survive Kubernetes pod evictions, function timeouts, and downstream API failures without losing progress.

Why it exists

A long-running autonomous agent is essentially a while-loop interacting with stateful tools, a fragility profile that no stateless microservice framework was designed for. Before Kitaru, agent developers either built bespoke checkpoint/replay logic on top of each framework or accepted that a pod eviction at step 11 of 12 would burn the entire run. Kitaru makes durable execution a reusable layer: each step boundary persists inputs, intermediate outputs, and LLM responses to S3, and on failure the system resumes from the last successful boundary rather than from zero.
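The resume-from-last-boundary idea can be illustrated with a minimal sketch. This is not Kitaru's API: `CheckpointStore`, `run_steps`, and `demo` are hypothetical names, and a local temporary directory stands in for the S3 bucket. Each step writes its input and output when it completes, and a rerun of the same run ID skips any step that already has a record.

```python
import json
import tempfile
from pathlib import Path


class CheckpointStore:
    """Hypothetical file-backed store; a local directory stands in for S3."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, run_id, step):
        return self.root / f"{run_id}-{step}.json"

    def load(self, run_id, step):
        path = self._path(run_id, step)
        return json.loads(path.read_text()) if path.exists() else None

    def save(self, run_id, step, record):
        self._path(run_id, step).write_text(json.dumps(record))


def run_steps(store, run_id, steps, initial):
    """Run steps in order, reusing any checkpoint left by an earlier attempt."""
    value, executed = initial, []
    for name, fn in steps:
        record = store.load(run_id, name)
        if record is None:
            # No checkpoint yet: execute the step, then persist its boundary.
            record = {"input": value, "output": fn(value)}
            store.save(run_id, name, record)
            executed.append(name)
        value = record["output"]
    return value, executed


def demo():
    """Simulate a crash in the second step, then resume the same run."""
    calls = {"fetch": 0}
    crash = {"enabled": True}

    def fetch(topic):
        calls["fetch"] += 1
        return [f"notes on {topic}"]

    def summarize(notes):
        if crash["enabled"]:
            raise RuntimeError("simulated pod eviction")
        return f"{len(notes)} note(s) summarized"

    steps = [("fetch", fetch), ("summarize", summarize)]
    with tempfile.TemporaryDirectory() as root:
        store = CheckpointStore(root)
        try:
            run_steps(store, "run-1", steps, "durable agents")
        except RuntimeError:
            pass  # the first attempt dies after "fetch" was checkpointed
        crash["enabled"] = False
        result, executed = run_steps(store, "run-1", steps, "durable agents")
    return result, executed, calls["fetch"]
```

In this sketch the second attempt executes only `summarize`, and `fetch` runs exactly once across both attempts. A real runtime would also need to handle concurrent writers, non-JSON payloads, and partial writes, none of which are modeled here.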

Primary use cases

Long-running agents in regulated environments (compliance auto-audit, multi-day research synthesis, ETL agents) where rerunning is expensive or impossible; human-in-the-loop workflows where the agent must suspend for hours-to-days waiting for approval; multi-agent fan-out where the orchestrator needs to pause until secondary agents complete; replay-debugging for non-deterministic agent failures.
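For the human-in-the-loop case, one way to picture suspension is a workflow that parks itself when a decision is missing and picks up later from persisted state. The sketch below is an assumption-laden illustration, not Kitaru's mechanism: `Suspended`, `run_with_approval`, and the plain `state` dict (standing in for durable run state) are all hypothetical.

```python
class Suspended(Exception):
    """Signals that the run has parked itself until a decision arrives."""


def run_with_approval(state, draft_fn, publish_fn):
    """Advance the workflow as far as the persisted state allows.

    `state` is a plain dict standing in for durable run state.
    """
    if "draft" not in state:
        state["draft"] = draft_fn()  # expensive agent work, done once
    if "approved" not in state:
        raise Suspended("waiting for reviewer decision")
    if not state["approved"]:
        return "rejected"
    return publish_fn(state["draft"])


def demo():
    drafts = []

    def draft_fn():
        drafts.append("audit report v1")
        return drafts[-1]

    state = {}
    try:
        run_with_approval(state, draft_fn, str.upper)
        suspended = False
    except Suspended:
        suspended = True  # the process could exit here without losing the draft

    state["approved"] = True  # hours or days later, a reviewer signs off
    result = run_with_approval(state, draft_fn, str.upper)
    return suspended, result, len(drafts)
```

The point of the design is that the draft is produced once, before the pause, and the resumed call goes straight to the approval check without repeating the agent's work.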

Recent developments

Latest signals

Kitaru repo public under ZenML. The repository is under active development and includes a reference Python decorator API and an S3-backed artifact store, with the Pydantic AI integration shipped as the flagship example. Per GitHub: zenml-io/kitaru.
Positioned against Restate as the agent-shaped durable runtime. ZenML's comparison frames Restate as optimized for traditional workflows (strongly consistent virtual objects, journaled event logs) and Kitaru as agent-optimized (versioned artifact storage, artifact lineage, pause/resume aligned with LLM generation cycles). Per ZenML: Kitaru vs Restate.
Pydantic AI integration documented end-to-end. Pydantic published a long-form article on Kitaru as the durable runtime layer under Pydantic AI agents. Per Pydantic: Runtime layer for Pydantic AI agents.
Artifact-versioned failure semantics. An interrupted run becomes a versioned, addressable S3 artifact rather than a stack trace: developers can inspect intermediate outputs, correct faulty assumptions, override a checkpoint, and resume downstream. Per ZenML: Kitaru product page.
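The override-and-resume workflow can be sketched with versioned checkpoints. This is a toy model under stated assumptions, not Kitaru's implementation: `resume`, `override`, and the in-memory `checkpoints` dict are hypothetical, and the sketch chooses to reuse a step only when its recorded input matches the value flowing into it, so a manual override upstream forces every downstream step to rerun.

```python
def resume(checkpoints, steps, initial):
    """Replay a run, reusing a step only if its latest version saw this input."""
    value, executed = initial, []
    for name, fn in steps:
        versions = checkpoints.setdefault(name, [])
        if versions and versions[-1]["input"] == value:
            value = versions[-1]["output"]  # upstream unchanged: reuse
            continue
        output = fn(value)
        versions.append({"input": value, "output": output, "source": "run"})
        executed.append(name)
        value = output
    return value, executed


def override(checkpoints, step, output):
    """Append a manually corrected version; earlier versions stay inspectable."""
    previous = checkpoints[step][-1]
    checkpoints[step].append(
        {"input": previous["input"], "output": output, "source": "manual"}
    )


def demo():
    steps = [
        ("extract", lambda x: x * 2),
        ("score", lambda x: x + 1),
        ("report", lambda x: f"score={x}"),
    ]
    checkpoints = {}
    first, _ = resume(checkpoints, steps, 5)
    override(checkpoints, "extract", 20)  # developer corrects the faulty output
    second, executed = resume(checkpoints, steps, 5)
    return first, second, executed, len(checkpoints["extract"])
```

After the override, `extract` is not re-executed; only `score` and `report` run again, against the corrected value. The original `extract` output remains as an earlier version alongside the manual one.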