Project Nessie
An open-source transactional catalog for data lakes that provides Git-like branching, tagging, and commit semantics for Iceberg table metadata, enabling isolated experimentation and atomic multi-table operations.
Summary
Nessie sits in the catalog layer between query engines and S3-stored Iceberg tables. Unlike Hive Metastore or Glue Catalog, Nessie tracks table state as a history of commits, enabling branch-based workflows (test a schema change on a branch, merge when validated) without duplicating data on S3.
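As a rough illustration of what sitting in the catalog layer means for a query engine, the sketch below builds the Spark properties that point an Iceberg catalog at a Nessie server. The helper name `nessie_spark_conf`, the server URI, and the warehouse path are assumptions made for this example; the property keys follow the commonly documented Iceberg and Nessie Spark settings, but verify them against the versions you deploy.

```python
from typing import Dict


def nessie_spark_conf(
    uri: str,
    warehouse: str,
    ref: str = "main",
    catalog_name: str = "nessie",
) -> Dict[str, str]:
    """Build Spark properties for an Iceberg catalog backed by Nessie.

    Hypothetical helper for illustration; it only assembles a dict and
    does not contact Spark or a Nessie server.
    """
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        # SQL extensions for Iceberg and for Nessie branch/tag statements.
        "spark.sql.extensions": ",".join([
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
            "org.projectnessie.spark.extensions.NessieSparkSessionExtensions",
        ]),
        # Register an Iceberg Spark catalog whose implementation is Nessie.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.catalog-impl": "org.apache.iceberg.nessie.NessieCatalog",
        # Where the Nessie server lives and which branch or tag to read and write.
        f"{prefix}.uri": uri,
        f"{prefix}.ref": ref,
        # Root location for table data and metadata files.
        f"{prefix}.warehouse": warehouse,
    }


# Placeholder endpoint and bucket; substitute your own values.
conf = nessie_spark_conf(
    uri="http://localhost:19120/api/v1",
    warehouse="s3a://example-bucket/warehouse",
    ref="dev",
)
```

Each key/value pair could then be passed to `SparkSession.builder.config(key, value)`. Changing only `ref` switches the session to another branch while the warehouse location stays the same.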
- Nessie branches do not copy data files on S3. Branches are lightweight metadata pointers. Only the metadata (table snapshots, schema) is versioned; data files are shared across branches via copy-on-write semantics.
- Nessie is a catalog, not a query engine. It must be integrated with Spark, Flink, Trino, or Dremio to execute queries.
- Merge conflicts in Nessie follow table-level semantics. Concurrent modifications to the same table on different branches require explicit conflict resolution.
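The points above can be made concrete with a toy in-memory model. This is not Nessie's API or storage format: `ToyCatalog`, `Commit`, and the metadata strings are hypothetical names invented for the sketch. It only shows the idea that a branch is a pointer to a commit, that creating a branch copies nothing, and that a merge is rejected when both sides changed the same table.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True, eq=False)
class Commit:
    """Immutable catalog state: table name -> pointer to a metadata file."""
    tables: Dict[str, str]
    parent: Optional["Commit"] = None


class ToyCatalog:
    """Hypothetical stand-in for a versioned catalog (illustration only)."""

    def __init__(self) -> None:
        self.branches: Dict[str, Commit] = {"main": Commit(tables={})}

    def create_branch(self, name: str, source: str = "main") -> None:
        # A branch is just a second name for an existing commit; no data is copied.
        self.branches[name] = self.branches[source]

    def commit(self, branch: str, changes: Dict[str, str]) -> None:
        head = self.branches[branch]
        self.branches[branch] = Commit({**head.tables, **changes}, parent=head)

    def _common_ancestor(self, a: str, b: str) -> Commit:
        seen = set()
        node = self.branches[a]
        while node is not None:
            seen.add(node)
            node = node.parent
        node = self.branches[b]
        while node not in seen:  # every branch descends from the root commit
            node = node.parent
        return node

    def merge(self, source: str, target: str) -> None:
        base = self._common_ancestor(source, target)
        src, dst = self.branches[source], self.branches[target]
        # Tables whose metadata pointer differs from the common ancestor.
        # Simplification: dropped tables are not tracked in this toy.
        changed_src = {t for t, m in src.tables.items() if base.tables.get(t) != m}
        changed_dst = {t for t, m in dst.tables.items() if base.tables.get(t) != m}
        conflicts = sorted(changed_src & changed_dst)
        if conflicts:
            raise ValueError(f"conflicting tables: {conflicts}")
        self.commit(target, {t: src.tables[t] for t in changed_src})


catalog = ToyCatalog()
catalog.commit("main", {"orders": "orders-v1", "users": "users-v1"})
catalog.create_branch("dev")
catalog.commit("dev", {"orders": "orders-v2"})   # schema change tested on dev
catalog.merge("dev", "main")                     # main now points at orders-v2
```

Had `main` also committed a new `orders` pointer before the merge, `merge` would raise instead of silently picking a side, which mirrors the explicit conflict resolution described above.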
scoped_to: Metadata Management, Data Versioning (Git-like catalog for table metadata)
enables: Apache Iceberg (serves as an Iceberg catalog with branching)
enables: Branching / Tagging (the architectural pattern Nessie implements)
alternative_to: AWS Glue Catalog, Hive Metastore (catalog with version control semantics)
Definition
An open-source transactional catalog for data lakes that provides Git-like branching and tagging semantics for Iceberg tables stored on S3. Enables isolated experimentation on production datasets without copying data.
Traditional catalogs (Hive Metastore, Glue) offer no branching or isolation — every change is immediately visible to all consumers. Nessie adds Git-like version control to table metadata, enabling safe experimentation, rollback, and multi-table atomic commits.
Typical uses: branched experimentation on Iceberg tables, multi-table atomic commits, and catalog-level versioning and rollback.
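To illustrate why commit history makes multi-table atomic commits and rollback possible, here is a minimal sketch under stated assumptions. The `Branch` and `Commit` classes are hypothetical and are not Nessie's API. The sketch assumes optimistic concurrency: a writer names the head it read, and the commit is applied only if the branch has not moved since.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True, eq=False)
class Commit:
    """Immutable catalog state: table name -> pointer to a metadata file."""
    tables: Dict[str, str]
    parent: Optional["Commit"] = None


class Branch:
    """Hypothetical branch with compare-and-swap commits (illustration only)."""

    def __init__(self) -> None:
        self.head = Commit(tables={})

    def commit(self, expected_head: Commit, changes: Dict[str, str]) -> Commit:
        # All table changes land in one new commit, so readers see either
        # none of them or all of them.
        if self.head is not expected_head:
            raise RuntimeError("branch moved; retry against the new head")
        self.head = Commit({**self.head.tables, **changes}, parent=self.head)
        return self.head

    def rollback(self, to: Commit) -> None:
        # Rollback only moves the pointer back to an earlier commit in history.
        node = self.head
        while node is not None and node is not to:
            node = node.parent
        if node is None:
            raise ValueError("target is not in this branch's history")
        self.head = to


main = Branch()
empty = main.head
# Two tables become visible together in a single commit.
loaded = main.commit(empty, {"orders": "orders-v1", "order_items": "items-v1"})
# Undo the load by resetting the branch to the earlier commit.
main.rollback(empty)
```

A traditional catalog that updates each table pointer independently has no equivalent of `expected_head` or of resetting to an earlier commit, which is the contrast drawn above.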
Connections 9
Outbound 7: scoped_to 3, implements 1, enables 1, solves 1, used_by 1
Inbound 2: enables 1, depends_on 1
Resources 3
Official Project Nessie site for the Git-like transactional catalog providing branching, tagging, and commit history for Iceberg tables on S3.
Nessie source repository with the catalog server, CLI tools, and integrations for Spark, Flink, and Dremio.
Nessie API and configuration reference covering branch management, merge operations, and multi-table transactions.