Iceberg V3 Spec
The 2025 evolution of the Apache Iceberg table specification, introducing Row Lineage for row-level provenance tracking, native CDC detection, enhanced deletion handling, and metadata designed to make the lakehouse "agent-ready" for AI systems.
Summary
As Iceberg becomes the dominant lakehouse format, V3 addresses the gaps that emerged at scale: Row Lineage gives each row a stable identity and records when it last changed, native CDC detection removes the need for external change tracking, and deletion vectors support streaming updates without whole-file rewrites. V3 is the spec that makes Iceberg both batch/streaming-capable and AI-agent-readable.
- Engine support for V3 features is not immediate. Query engines need time to implement Row Lineage and native CDC; check engine compatibility before depending on V3-specific capabilities.
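- V3 is backwards-compatible with V2 data. Upgrading the spec version is a metadata change and does not require rewriting existing data files. The reverse does not hold: an engine limited to V2 cannot read a table once it has been upgraded.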
- "Agent-ready" refers to metadata granularity, not an AI integration layer. V3 exposes provenance metadata that AI systems can consume, but does not include built-in agent APIs.
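As a minimal sketch of what an upgrade looks like in practice: Iceberg tracks the spec version in the `format-version` table property, which Spark SQL can set through `ALTER TABLE ... SET TBLPROPERTIES`. The helper name `upgrade_statement` and the table name `catalog.db.events` are hypothetical, and the snippet only builds the statement; whether a given engine accepts version 3 depends on the engine and Iceberg release in use.

```python
import re

# Hypothetical helper (not part of any Iceberg API): builds the Spark SQL
# statement that sets Iceberg's `format-version` table property.
# The identifier check keeps stray SQL out of the generated statement.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*")


def upgrade_statement(table: str, target_version: int = 3) -> str:
    if not _IDENT.fullmatch(table):
        raise ValueError(f"unexpected table identifier: {table!r}")
    if target_version not in (2, 3):
        raise ValueError("this sketch only handles upgrades to v2 or v3")
    return (
        f"ALTER TABLE {table} "
        f"SET TBLPROPERTIES ('format-version' = '{target_version}')"
    )


statement = upgrade_statement("catalog.db.events")
print(statement)

# Usage, assuming an existing SparkSession named `spark` with an Iceberg catalog:
#   spark.sql(statement)
```

Because format-version upgrades are one-way, it is worth confirming that every reader and writer of the table supports V3 before running the statement.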
extends: Iceberg Table Spec (evolutionary improvement to the existing standard)
enables: Apache Iceberg (new capabilities for Iceberg implementations)
scoped_to: Table Formats, S3
Definition
The 2025–2026 evolution of the Apache Iceberg table specification. V3 introduces four substantive changes:
- **Row Lineage**: every row carries a unique row ID (`_row_id`) and the sequence number of the commit that last modified it (`_last_updated_sequence_number`), so an incremental reader can select only the rows changed since its last checkpoint instead of diffing full snapshots.
- **Deletion Vectors**: Roaring bitmaps stored in Puffin files that mark logically deleted row positions instead of rewriting whole Parquet files; Databricks reports up to 10× faster MERGE/UPDATE.
- **Native CDC detection**: inserts, updates, and deletes are detectable at the table-format level, without an external change-tracking system.
- **VARIANT data type**: shredded semi-structured payloads (nested JSON, IoT telemetry, application logs) stored alongside strict relational columns, with scan performance on shredded fields close to that of native columns.

V3 reached **Public Preview in Snowflake (March 2026)** and entered **bidirectional interop with Databricks Unity Catalog** the same quarter; AWS announced support for V3 deletion vectors and row lineage in November 2025.
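The way a row ID and a last-modified sequence number support incremental reads and change detection can be sketched in a few lines. This is an in-memory illustration only: the `Row` class, `changes_since`, and `classify` are hypothetical names, the two attributes stand in for the V3 metadata fields `_row_id` and `_last_updated_sequence_number`, and a real engine would prune files through manifest metadata rather than loop over rows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    row_id: int            # stands in for the V3 `_row_id` field
    last_updated_seq: int  # stands in for `_last_updated_sequence_number`
    payload: str


def changes_since(rows, last_seen_seq):
    """Return rows whose last modification is newer than the checkpoint."""
    return [r for r in rows if r.last_updated_seq > last_seen_seq]


def classify(before, after):
    """Diff two table states by row ID into inserts, updates, and deletes."""
    old = {r.row_id: r for r in before}
    new = {r.row_id: r for r in after}
    return {
        "insert": sorted(new.keys() - old.keys()),
        "update": sorted(
            rid for rid in new.keys() & old.keys()
            if new[rid].last_updated_seq > old[rid].last_updated_seq
        ),
        "delete": sorted(old.keys() - new.keys()),
    }


# Table state after commit 1, then after commit 2
# (row 1 updated, row 2 deleted, row 3 inserted).
state_1 = [Row(0, 1, "a"), Row(1, 1, "b"), Row(2, 1, "c")]
state_2 = [Row(0, 1, "a"), Row(1, 2, "b2"), Row(3, 2, "d")]

changed = changes_since(state_2, last_seen_seq=1)
print([r.row_id for r in changed])  # [1, 3]
print(classify(state_1, state_2))   # {'insert': [3], 'update': [1], 'delete': [2]}
```

The stable row ID is what lets an update be told apart from a delete followed by an insert; without it, a consumer has to compare row contents across snapshots.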
V2 revealed three structural limits at scale: update paths that either rewrote whole data files (copy-on-write) or accumulated many small position-delete files made CDC pipelines economically punishing, the lack of row provenance forced full-table scans for incremental processing, and the strict relational schema required separate normalization ETL for any semi-structured ingest. V3 addresses all three at the spec layer, so engines (Spark, Trino, Flink, Athena, Snowflake, Databricks) can implement them once against a shared standard rather than through bespoke patches.
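The cost difference between rewriting a file and marking positions as deleted can be shown with a toy model. This is a sketch under stated assumptions: `copy_on_write_delete` and `DeletionVector` are hypothetical names, a Python list stands in for a Parquet data file, and a Python integer used as a bitset stands in for the Roaring bitmap that a real deletion vector stores in a Puffin file.

```python
def copy_on_write_delete(data_file, positions):
    """Copy-on-write: every surviving row is rewritten into a new file."""
    doomed = set(positions)
    new_file = [row for pos, row in enumerate(data_file) if pos not in doomed]
    return new_file, len(new_file)  # second value = rows physically rewritten


class DeletionVector:
    """Toy deletion vector: one bit per row position in a single data file."""

    def __init__(self):
        self.bits = 0  # assumption: int-as-bitset instead of a Roaring bitmap

    def delete(self, position):
        self.bits |= 1 << position

    def is_deleted(self, position):
        return (self.bits >> position) & 1 == 1

    def scan(self, data_file):
        """Read the unchanged data file, skipping positions marked deleted."""
        return [
            row for pos, row in enumerate(data_file)
            if not self.is_deleted(pos)
        ]


data_file = [f"row-{i}" for i in range(1000)]

# Deleting two rows by rewriting the file touches 998 rows.
rewritten, rows_rewritten = copy_on_write_delete(data_file, [7, 42])

# Deleting the same two rows with a deletion vector sets two bits
# and leaves the data file untouched.
dv = DeletionVector()
dv.delete(7)
dv.delete(42)

print(rows_rewritten)                # 998
print(dv.scan(data_file) == rewritten)  # True
```

Both paths return the same rows to a reader; the difference is that the write cost of the deletion-vector path scales with the number of deleted rows, not with the size of the file.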
Typical use cases:
- Row-level data lineage for compliance and AI provenance
- Native CDC detection in Iceberg tables
- High-frequency MERGE/UPDATE workloads via deletion vectors
- Querying semi-structured payloads (JSON, telemetry) without normalization ETL
- Agent-ready metadata exposure
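For the semi-structured case, the idea behind shredding can be sketched as follows: fields that appear with a predictable type are pulled into their own typed columns, and whatever does not fit stays in a residual blob. This is a conceptual illustration only; `shred` and the sample `events` are hypothetical, plain Python lists and JSON strings stand in for columns, and the actual VARIANT binary encoding and Parquet layout are not modelled.

```python
import json


def shred(records, shredded_paths):
    """Split JSON-like records into typed columns plus a residual column.

    `shredded_paths` maps a top-level key to the exact Python type expected
    there. Matching values move into a dedicated column; everything else
    (other keys, or values of an unexpected type) stays in the residual.
    """
    columns = {key: [] for key in shredded_paths}
    residual = []
    for record in records:
        rest = dict(record)
        for key, expected in shredded_paths.items():
            if type(rest.get(key)) is expected:
                columns[key].append(rest.pop(key))
            else:
                columns[key].append(None)  # mismatch: value stays in residual
        residual.append(json.dumps(rest, sort_keys=True))
    return columns, residual


events = [
    {"device": "t-1", "temp": 21.5, "tags": ["lab"]},
    {"device": "t-2", "temp": "n/a"},
    {"device": "t-3", "temp": 19.0, "fw": {"major": 2}},
]

columns, residual = shred(events, {"device": str, "temp": float})

# A query on `temp` reads only the typed column, never the residual blobs.
total_temp = sum(v for v in columns["temp"] if v is not None)

print(columns["temp"])  # [21.5, None, 19.0]
print(residual)
print(total_temp)       # 40.5
```

The second record shows why a residual is needed: `"n/a"` does not fit the typed `temp` column, so it is kept as-is rather than dropped or coerced.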
Recent developments
- Iceberg 1.11.0 (May 19, 2026) marks V3 production-ready across upstream Apache, not just vendor previews. The release finalizes the File Format API — a consistent FormatModel/FormatModelRegistry plugin layer for all file formats — and promotes the full V3 feature set to stable: deletion vectors, the Variant type with shredding, native geospatial GEOMETRY/GEOGRAPHY first-class types, and nanosecond-precision timestamps. These are no longer experimental flags but the supported path for workloads that outgrow V2. Per Announcing Apache Iceberg 1.11.0 (Google Open Source Blog).
- Snowflake V3 Public Preview (March 4, 2026). Snowflake announced support for Apache Iceberg V3 in public preview — deletion vectors, row lineage, VARIANT, geospatial types all available for external-engine reads via Horizon Iceberg REST Catalog. Per Snowflake Docs — Support for Apache Iceberg V3 (Preview) March 4, 2026.
- Databricks V3 Public Preview on Unity Catalog Managed Iceberg. Unity Catalog Managed Iceberg v3 tables support Row Lineage, Deletion Vectors, and VARIANT in public preview — closing the "Delta-only feature parity" gap on the Iceberg side. Per Databricks Blog — The Next Era of the Open Lakehouse: Apache Iceberg v3 Public Preview.
- "No more tradeoff between Delta Lake performance features and Iceberg compatibility." Databricks' 2026 framing: Iceberg V3 + Deletion Vectors + Row Lineage + VARIANT structurally closes the performance gap that historically pushed teams onto Delta Lake for high-write workloads. Per Databricks Blog — Iceberg v3 Public Preview.
- Deletion vectors deliver up to 10× write-performance improvement. Logical deletions are tracked in small delete files (Puffin-encoded Roaring bitmaps), which avoids large Parquet rewrites and reduces write amplification. This is the core performance argument for V3 over V2. Per Databricks Docs: Use Apache Iceberg v3 Features.
- Cross-engine federation: Unity Catalog (UC) reads Iceberg tables managed in Snowflake, Glue, Salesforce, and other catalogs. UC's open APIs support write-once-read-anywhere access, and UC can federate to other Iceberg catalogs for bidirectional interop. Iceberg V3 is the format substrate; cross-engine catalog federation is the access pattern. Per Databricks Blog: Announcing Full Apache Iceberg Support in Databricks.
- Data + AI Summit (June 15-18, 2026) is the V3 roadmap-disclosure venue. Databricks signaled that the broader V3 roadmap will be detailed at the Data + AI Summit, so it is worth tracking for a possible V3 GA announcement and for future spec direction. Per Databricks Blog: Iceberg v3 Public Preview.
Resources
Official Iceberg specification including V3 changes for row-level lineage, enhanced deletion tracking, and CDC support.
2025 year-in-review covering Iceberg V3 spec evolution, Polaris adoption, and ecosystem maturity.