Guide 21

Iceberg v3 in Production — Deletion Vectors and Row Lineage

Problem Framing

Iceberg v2 positional delete files cause significant read amplification during query execution: engines must join data files against delete files at query time to exclude deleted rows. Under CDC-heavy upsert workloads, this join cost grows linearly with the number of uncompacted delete files. Iceberg v3 replaces positional deletes with binary deletion vectors stored in Puffin files and adds row lineage tracking for fine-grained CDC audit trails. Engineers need to understand the migration path from v2 to v3 and the operational implications for their engine stack.

Relevant Nodes

  • Topics: S3, Table Formats
  • Technologies: Apache Iceberg
  • Standards: Iceberg V3 Spec
  • Architectures: Deletion Vector, CDC into Lakehouse, Compaction
  • Pain Points: Read / Write Amplification, Schema Evolution

Decision Path

  1. Understand positional deletes vs. deletion vectors. Positional deletes in v2 are stored as separate Parquet files containing (file_path, position) pairs that identify deleted rows. Each query must join these against data files. Deletion vectors in v3 are compressed bitmaps stored in Puffin files, one bitmap per data file, with a set bit marking each deleted row position. Testing a row against the bitmap is a constant-time lookup instead of a join, eliminating the read amplification caused by delete file accumulation.
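
A minimal sketch of the difference, in plain Python rather than Iceberg internals: the v2-style read joins an accumulated list of (file_path, position) pairs against each data file, while the v3-style read checks one bitmap per file (modeled here as a Python int; Iceberg uses roaring bitmaps in Puffin). File names and row counts are hypothetical.

```python
# v2: positional deletes are (file_path, position) pairs kept in
# separate delete files; readers must anti-join them against data files.
positional_deletes = [
    ("data/part-00000.parquet", 3),
    ("data/part-00000.parquet", 7),
    ("data/part-00001.parquet", 0),
]

def read_with_join(file_path, rows, deletes):
    # Build the deleted-position set for this file on every read:
    # cost grows with the total number of accumulated delete entries.
    deleted = {pos for (path, pos) in deletes if path == file_path}
    return [row for pos, row in enumerate(rows) if pos not in deleted]

# v3: one bitmap per data file; bit i set means row i is deleted.
deletion_vector = (1 << 3) | (1 << 7)

def read_with_vector(rows, dv):
    # Membership test is a constant-time bit check per row.
    return [row for pos, row in enumerate(rows) if not (dv >> pos) & 1]

rows = [f"row-{i}" for i in range(10)]
assert read_with_join("data/part-00000.parquet", rows, positional_deletes) \
    == read_with_vector(rows, deletion_vector)
```

Both reads return the same surviving rows; the difference is that the bitmap check stays cheap no matter how many deletes have accumulated.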

  2. Assess v3 readiness in your engine stack. Not all engines support v3 features simultaneously. As of early 2026:

    • Spark 4.x supports v3 deletion vectors and row lineage natively.
    • Trino has partial v3 support: deletion vector reads are supported, but row lineage writes may lag.
    • Flink Iceberg connector support depends on the connector version and may trail the spec.
    • DuckDB reads v3 tables via the Iceberg extension but may not write deletion vectors.
  3. Plan the v2-to-v3 metadata migration. Upgrading a table's format version is a metadata-only operation — no data files are rewritten. However, once upgraded to v3, older engines that lack v3 support cannot read the table. Coordinate engine upgrades before migrating production tables.

    • Test in a staging environment with a snapshot of production metadata.
    • Roll out engine upgrades first, then upgrade table format version.
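
As a sketch of the upgrade step itself: the v1-to-v2 upgrade in Spark is performed by setting the table's 'format-version' property, and the v3 upgrade is assumed here to use the same mechanism. The table name is hypothetical.

```python
def format_upgrade_ddl(table: str, target_version: int = 3) -> str:
    """Build the metadata-only upgrade statement (no data files rewritten)."""
    return (
        f"ALTER TABLE {table} "
        f"SET TBLPROPERTIES ('format-version' = '{target_version}')"
    )

# Run via spark.sql(...) only after every reader and writer engine has been
# upgraded: once the table is on v3, older engines cannot read it.
print(format_upgrade_ddl("prod.db.events"))
```

Because the statement only touches metadata, it is fast and cheap; the irreversibility for old readers is the operational risk, not the cost of the command.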
  4. Configure Puffin file compaction. Deletion vectors accumulate as individual Puffin files. Compaction applies them by rewriting the affected data files without the deleted rows, producing clean data files with no associated deletion vectors. Configure compaction frequency to balance write amplification (extra rewrites) against read performance.
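
Logically, what compaction accomplishes can be sketched as follows (this is an illustration of the effect, not the Iceberg rewrite action itself; the bitmap is again modeled as a Python int):

```python
def compact(rows, deletion_vector):
    """Rewrite a data file's rows, dropping deleted positions.

    Returns (new_rows, new_deletion_vector); the rewritten file starts
    with an empty vector, which is what restores cheap reads.
    """
    kept = [row for pos, row in enumerate(rows)
            if not (deletion_vector >> pos) & 1]
    return kept, 0  # rewritten file carries no deletion vector

rows = ["a", "b", "c", "d"]
new_rows, new_dv = compact(rows, 0b0101)  # rows 0 and 2 deleted
assert new_rows == ["b", "d"] and new_dv == 0
```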

  5. Implement row lineage for CDC audit trails. v3 row lineage gives each row a stable identifier and records a monotonically increasing sequence number for each modification, enabling downstream consumers to reconstruct the mutation history of any row across snapshots. This is valuable for regulatory audit, debugging CDC pipelines, and change replay.

    • Row lineage adds metadata overhead per row — evaluate storage impact at your scale.
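
The mechanism can be modeled in a few lines. This is an illustrative data structure, not the spec's column layout: the stable row id, sequence counter, and `history` log here are assumptions that mirror the v3 concepts of a row identifier plus a last-updated sequence number.

```python
import itertools

_seq = itertools.count(1)  # stands in for the table's commit sequence

class LineageTable:
    def __init__(self):
        self.rows = {}     # row_id -> current value
        self.history = []  # (seq, row_id, op, value) audit trail

    def upsert(self, row_id, value):
        seq = next(_seq)   # monotonically increasing per modification
        op = "update" if row_id in self.rows else "insert"
        self.rows[row_id] = value
        self.history.append((seq, row_id, op, value))

    def lineage(self, row_id):
        """Reconstruct the mutation history of one row for audit/replay."""
        return [h for h in self.history if h[1] == row_id]

t = LineageTable()
t.upsert("r1", {"amount": 10})
t.upsert("r1", {"amount": 25})
assert [op for _, _, op, _ in t.lineage("r1")] == ["insert", "update"]
```

Note the per-row overhead the bullet above warns about: every mutation appends an entry, which is exactly the storage cost to evaluate at scale.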
  6. Monitor read amplification reduction. After migration, track the ratio of deletion vector files to data files. A healthy ratio stays low (under 10%). Compaction should keep deletion vectors from accumulating beyond this threshold.
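
The monitoring check above reduces to a ratio and a threshold. The 10% figure comes from this guide, and in practice the file counts would come from the table's metadata tables; the function names here are illustrative.

```python
def dv_ratio(num_data_files: int, num_dv_files: int) -> float:
    """Ratio of deletion-vector (Puffin) files to data files."""
    return num_dv_files / num_data_files if num_data_files else 0.0

def needs_compaction(num_data_files, num_dv_files, threshold=0.10):
    # Alert when deletion vectors accumulate past the healthy ratio.
    return dv_ratio(num_data_files, num_dv_files) > threshold

assert not needs_compaction(1000, 80)   # 8%: healthy
assert needs_compaction(1000, 150)      # 15%: schedule compaction
```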

What Changed Over Time

  • Iceberg v1 supported only full file rewrites for deletes — any row-level delete required rewriting the entire data file.
  • v2 (2022) introduced positional deletes, enabling row-level operations without full rewrites but creating a read amplification problem under high-churn workloads.
  • v3 (spec ratified 2024, engine support rolling out 2025–2026) replaces positional deletes with deletion vectors, aligning Iceberg with the approach Delta Lake adopted earlier via its own deletion vector implementation.
  • Row lineage in v3 is a new capability with no direct precedent in earlier versions — it reflects the growing demand for fine-grained audit in regulated data pipelines.
  • The v3 migration is metadata-only, but the ecosystem fragmentation (engines at different support levels) means production rollout requires careful coordination.

Sources