Standard

Delta Lake Protocol

Summary

What it is

The specification for ACID transaction logs over Parquet files on object storage. It defines how writes, deletes, and schema changes are recorded in a JSON-based commit log stored alongside the data files, in the table's _delta_log directory.
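As a rough illustration of how a reader turns the commit log into table state, the sketch below replays a list of commits and tracks which data files are live. The action names (add, remove, metaData) follow the protocol, but the function name replay, the file names, and the in-memory commit strings are assumptions made for the example. A real reader would also load checkpoints and handle many more action fields.

```python
import json

def replay(commits):
    """Return the set of live data file paths after applying commits in order.

    `commits` is a list of strings, one per commit file, each holding
    newline-delimited JSON actions (hypothetical in-memory stand-in for
    the files under _delta_log).
    """
    live = set()
    for commit in commits:
        for line in commit.splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
            # Other actions (metaData, protocol, commitInfo, ...) do not
            # change the set of live files and are ignored in this sketch.
    return live

# Commit 0: create the table and add two files (illustrative values).
commit_0 = "\n".join([
    json.dumps({"metaData": {"id": "example-table"}}),
    json.dumps({"add": {"path": "part-0000.parquet", "dataChange": True}}),
    json.dumps({"add": {"path": "part-0001.parquet", "dataChange": True}}),
])

# Commit 1: rewrite part-0000 into part-0002 (for example, after a delete).
commit_1 = "\n".join([
    json.dumps({"remove": {"path": "part-0000.parquet", "dataChange": True}}),
    json.dumps({"add": {"path": "part-0002.parquet", "dataChange": True}}),
])

print(sorted(replay([commit_0, commit_1])))
# → ['part-0001.parquet', 'part-0002.parquet']
```

Replaying only the first commit gives the table as of version 0, which is the basis for reading older snapshots.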

Where it fits

The Delta protocol is what makes Delta Lake tables transactional. The commit log serializes changes so that concurrent readers and writers see a consistent state, even on S3, which has no atomic rename and therefore needs extra coordination for concurrent writers.
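The serialization idea can be sketched as an optimistic commit loop: each writer tries to create the next numbered commit file, and exactly one writer can win a given version. The sketch below uses a local directory and Python's exclusive-create mode as a stand-in for a store with put-if-absent. The helper names (try_commit, latest_version, commit_with_retry) are hypothetical, and a real writer would also check whether the winning commit logically conflicts with its own changes before retrying.

```python
import json
import os
import tempfile

def commit_path(log_dir, version):
    # Commit files are named by zero-padded 20-digit version number.
    return os.path.join(log_dir, f"{version:020d}.json")

def try_commit(log_dir, version, actions):
    """Try to create commit `version`; return False if it already exists."""
    try:
        # Mode "x" fails if the file exists, which stands in for an
        # atomic put-if-absent. Partial writes after a crash are ignored
        # in this sketch.
        with open(commit_path(log_dir, version), "x") as f:
            f.write("\n".join(json.dumps(a) for a in actions))
        return True
    except FileExistsError:
        return False

def latest_version(log_dir):
    versions = [int(name.split(".")[0])
                for name in os.listdir(log_dir) if name.endswith(".json")]
    return max(versions, default=-1)

def commit_with_retry(log_dir, actions, max_attempts=5):
    """Commit at the next free version, retrying if another writer wins."""
    for _ in range(max_attempts):
        version = latest_version(log_dir) + 1
        if try_commit(log_dir, version, actions):
            return version
    raise RuntimeError("too many conflicting commits")

log_dir = tempfile.mkdtemp()
v0 = commit_with_retry(log_dir, [{"add": {"path": "part-0000.parquet"}}])
v1 = commit_with_retry(log_dir, [{"add": {"path": "part-0001.parquet"}}])

# A writer that still believes version 1 is free loses the race.
lost = try_commit(log_dir, 1, [{"add": {"path": "part-0002.parquet"}}])
print(v0, v1, lost)
```

On a store without such a primitive, the exclusive-create step is the part that an external log store has to provide.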

Misconceptions / Traps

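  • The Delta protocol requires the storage layer to support atomic put-if-absent or atomic rename (as HDFS and Azure ADLS Gen2 do), or else an external coordination mechanism (for example, a DynamoDB-backed log store). On S3, multi-cluster writes are unsafe without such a log store.
  • Protocol versions (minimum reader and writer versions, plus table features) must be managed carefully. Upgrading to a newer protocol version may leave older readers unable to open the table and older writers unable to commit to it.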
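The second trap can be made concrete with a small compatibility check. This is a minimal sketch, assuming the protocol action carries minReaderVersion and, for tables that use table features, a readerFeatures list. The function name can_read and the client capabilities passed to it are hypothetical, and deletionVectors is used only as an example feature name.

```python
def can_read(protocol, max_reader_version, supported_features=frozenset()):
    """Return True if a client with the given capabilities may read the table."""
    if protocol["minReaderVersion"] > max_reader_version:
        return False
    # With table features, every listed reader feature must be supported.
    required = set(protocol.get("readerFeatures", []))
    return required <= set(supported_features)

original = {"minReaderVersion": 1, "minWriterVersion": 2}
upgraded = {
    "minReaderVersion": 3,
    "minWriterVersion": 7,
    "readerFeatures": ["deletionVectors"],
    "writerFeatures": ["deletionVectors"],
}

print(can_read(original, max_reader_version=1))   # → True
print(can_read(upgraded, max_reader_version=1))   # → False
print(can_read(upgraded, max_reader_version=3,
               supported_features={"deletionVectors"}))  # → True
```

The middle case is the trap: the same old client that read the table before the upgrade is refused afterwards.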

Key Connections

  • enables Lakehouse Architecture: the spec that makes ACID transactions in Delta Lake possible
  • solves Schema Evolution: the table schema is recorded and enforced through the transaction log
  • Delta Lake depends_on Delta Lake Protocol
  • scoped_to Table Formats, Lakehouse

Definition

What it is

A specification for ACID transaction logs over Parquet files on object storage. Defines how writes, deletes, and schema changes are recorded in a JSON-based transaction log stored alongside data files.

Why it exists

To bring database-like reliability to data lakes. By serializing changes through a commit log, the Delta protocol ensures that concurrent readers and writers see a consistent table state, even on object storage that offers no multi-file transactions.

Primary use cases

ACID-compliant data lake tables, unified streaming and batch processing on S3, and audit trails via transaction log history.
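The audit-trail use case relies on the commitInfo action that writers may include in each commit. The sketch below extracts a simple history from in-memory commits. The protocol does not mandate the contents of commitInfo, so the operation and timestamp fields, the sample values, and the function name history are all assumptions for illustration.

```python
import json

def history(commits):
    """Return (version, operation, timestamp) for each commit with commitInfo."""
    entries = []
    for version, commit in enumerate(commits):
        for line in commit.splitlines():
            if not line.strip():
                continue
            info = json.loads(line).get("commitInfo")
            if info is not None:
                entries.append(
                    (version, info.get("operation"), info.get("timestamp")))
    return entries

# Hypothetical commits; the field values are made up for the example.
commits = [
    "\n".join([
        json.dumps({"commitInfo": {"operation": "WRITE", "timestamp": 1700000000000}}),
        json.dumps({"add": {"path": "part-0000.parquet"}}),
    ]),
    "\n".join([
        json.dumps({"commitInfo": {"operation": "DELETE", "timestamp": 1700000060000}}),
        json.dumps({"remove": {"path": "part-0000.parquet"}}),
    ]),
]

for entry in history(commits):
    print(entry)
```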

Relationships

Outbound Relationships

Inbound Relationships

depends_on

Resources