Standard

Iceberg REST Catalog Spec

Summary

What it is

An open REST API specification for Apache Iceberg catalog operations — namespace/table listing, metadata load, commit, snapshot management — enabling multi-engine interoperability through a standardized HTTP-based catalog interface. Extended in practice with **credential vending**, where the catalog mints prefix-scoped, short-lived S3 credentials at table-load time.

Where it fits

The REST Catalog Spec solves the catalog fragmentation problem in the Iceberg ecosystem. Instead of every engine needing native support for Hive Metastore, Glue, Nessie, and so on, any catalog that implements the REST spec becomes accessible to every REST-capable engine. The same protocol lets a local-first engine like DuckDB attach directly to an Amazon S3 Tables bucket (`ATTACH '<arn>' AS cat (TYPE iceberg, ENDPOINT_TYPE s3_tables)`) or clone its metadata via `iceberg_to_ducklake(...)`, so multi-terabyte remote tables can be queried interactively without heavy distributed compute.
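
As a rough illustration of the DuckDB attach path, the sketch below only builds the SQL statements; it does not connect to anything. The helper name `attach_statements`, the ARN pattern, and the example ARN are assumptions made for illustration, and the `INSTALL`/`LOAD` lines assume the Iceberg extension is not already loaded.

```python
import re

# Assumed shape of an S3 Tables bucket ARN (illustrative; adjust if yours differs).
S3_TABLES_ARN = re.compile(
    r"^arn:aws[a-z-]*:s3tables:[a-z0-9-]+:\d{12}:bucket/[a-z0-9][a-z0-9-]*$"
)


def attach_statements(arn: str, alias: str = "cat") -> list[str]:
    """Return the DuckDB statements that attach an S3 Tables bucket as a catalog.

    Hypothetical helper: it validates its inputs and formats SQL, nothing more.
    """
    if not S3_TABLES_ARN.match(arn):
        raise ValueError(f"not an S3 Tables bucket ARN: {arn!r}")
    if not alias.isidentifier():
        raise ValueError(f"alias must be a plain identifier: {alias!r}")
    return [
        "INSTALL iceberg",
        "LOAD iceberg",
        f"ATTACH '{arn}' AS {alias} (TYPE iceberg, ENDPOINT_TYPE s3_tables)",
    ]


if __name__ == "__main__":
    # Placeholder account ID and bucket name, not real resources.
    example_arn = "arn:aws:s3tables:us-east-1:111122223333:bucket/example-bucket"
    for statement in attach_statements(example_arn):
        print(statement + ";")
```

To run the statements, each string could be passed to `execute` on a `duckdb.connect()` connection. AWS credentials would have to be configured separately (for example through a DuckDB secret), which this sketch leaves out.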

Misconceptions / Traps
  • The REST Catalog Spec defines the API contract, not the catalog implementation. Performance, consistency, and feature completeness depend on the catalog server behind the API.
  • Not every REST catalog implementation supports every Iceberg catalog operation. Check compatibility for advanced features like branching, tagging, and view support.
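  • Credential vending is not guaranteed by every implementation. A client asks for it through the spec's optional access-delegation mechanism (the `X-Iceberg-Access-Delegation` request header), and it is widely adopted by servers such as Apache Polaris, Unity Catalog, and S3 Tables. Check whether your client understands the vended-credential response shape before assuming it "just works."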
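
The credential-vending caveat above can be made concrete with a small check on the client side. The sketch below assumes the server returns S3 credentials in the load-table response's `config` map under the `s3.access-key-id`, `s3.secret-access-key`, and `s3.session-token` keys. The function name and the sample payload are hypothetical, and a real server may use a different response shape.

```python
from typing import Optional

# Request header a client can send on load-table to ask for vended credentials.
ACCESS_DELEGATION_HEADER = {"X-Iceberg-Access-Delegation": "vended-credentials"}

# Config keys assumed to carry the short-lived S3 credentials.
REQUIRED_KEYS = ("s3.access-key-id", "s3.secret-access-key", "s3.session-token")


def vended_s3_credentials(load_table_response: dict) -> Optional[dict]:
    """Return the vended S3 credentials, or None if the response has none.

    Hypothetical helper: it only inspects the `config` map of a parsed
    load-table response and does not perform any HTTP request.
    """
    config = load_table_response.get("config") or {}
    if not all(config.get(key) for key in REQUIRED_KEYS):
        return None
    return {key: config[key] for key in REQUIRED_KEYS}


if __name__ == "__main__":
    # Made-up payload for illustration; values are placeholders.
    sample = {
        "metadata-location": "s3://example-bucket/db/tbl/metadata/v1.metadata.json",
        "metadata": {},
        "config": {
            "s3.access-key-id": "EXAMPLEKEYID",
            "s3.secret-access-key": "example-secret",
            "s3.session-token": "example-token",
        },
    }
    creds = vended_s3_credentials(sample)
    print("vended" if creds else "no vended credentials; fall back to local credentials")
```

Returning `None` rather than raising lets the caller decide whether to fall back to its own credentials or fail loudly.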

Key Connections
  • scoped_to Iceberg Table Spec, Table Formats — standardizes catalog access for Iceberg
  • used_by DuckDB — the direct-attach path for local-first analytics over S3 Tables
  • solves Vendor Lock-In — engine-agnostic catalog access
  • solves Metadata Overhead at Scale — enables centralized catalog management

Definition

What it is

An open specification defining a RESTful HTTP API for Iceberg catalog operations — listing namespaces and tables, loading table metadata, committing updates, and managing snapshots — independent of any specific catalog backend. Extended in practice with **credential vending**, where the catalog mints short-lived, prefix-scoped S3 credentials at table-load time so clients never hold broad long-lived keys.
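
To give a feel for what "a RESTful HTTP API for catalog operations" looks like on the wire, the sketch below maps four of the operations named above to an HTTP method and path. It follows the `/v1/{prefix}/namespaces/...` route layout and joins multi-level namespaces with the unit separator (`0x1F`). The function name, the operation labels, and the `prod` prefix are illustrative assumptions; a real client would take the prefix from the catalog's configuration.

```python
from urllib.parse import quote

# Multi-level namespaces are joined with the ASCII unit separator (0x1F),
# then percent-encoded when placed in a URL path.
NAMESPACE_SEPARATOR = "\x1f"


def _encode_namespace(levels: list[str]) -> str:
    return quote(NAMESPACE_SEPARATOR.join(levels), safe="")


def catalog_request(operation, prefix, namespace=None, table=None):
    """Return (HTTP method, path) for a catalog operation.

    Hypothetical helper covering only four routes; it builds strings and
    does not send any request.
    """
    base = f"/v1/{prefix}" if prefix else "/v1"
    if operation == "list_namespaces":
        return ("GET", f"{base}/namespaces")
    if namespace is None:
        raise ValueError(f"{operation} requires a namespace")
    tables = f"{base}/namespaces/{_encode_namespace(namespace)}/tables"
    if operation == "list_tables":
        return ("GET", tables)
    if table is None:
        raise ValueError(f"{operation} requires a table name")
    table_path = f"{tables}/{quote(table, safe='')}"
    if operation == "load_table":
        return ("GET", table_path)
    if operation == "commit_table":
        return ("POST", table_path)
    raise ValueError(f"unknown operation: {operation!r}")


if __name__ == "__main__":
    print(catalog_request("list_namespaces", "prod"))
    print(catalog_request("load_table", "prod", ["analytics", "events"], "clicks"))
```

Loading and committing share one path and differ only in the HTTP method, which is part of why a single client implementation can target any conforming backend.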

Why it exists

Iceberg catalogs were historically tied to specific implementations (Hive Metastore, AWS Glue, Nessie). The REST Catalog Spec decouples catalog clients from catalog backends, enabling multi-engine, multi-language interoperability through a universal HTTP interface.

Primary use cases

Multi-engine Iceberg catalog access; vendor-neutral catalog interoperability; cloud-managed Iceberg catalogs; cross-language catalog clients; local-first analytics (DuckDB `ATTACH '<s3_tables_arn>' AS cat (TYPE iceberg, ENDPOINT_TYPE s3_tables)`); and DuckLake metadata-only cloning (`CALL iceberg_to_ducklake(...)`) for interactive querying of multi-terabyte remote tables.

Recent developments

Latest signals
  • Vendors. The official vendor page lists 20+ vendors, including Confluent, Crunchy Data, Databricks, dltHub, Firebolt, Fivetran, Google Cloud BigLake, IBM watsonx.data, IOMETE, Microsoft OneLake, Oracle, Snowflake, and StarRocks. Per iceberg.apache.org.
  • May 7, 2026: Support for Apache Iceberg™ version 3 (General availability). Snowflake's Iceberg v3 support is now GA. New types: geography, geometry, nanosecond timestamp, and variant. New features: default values, deletion vectors, and row lineage. External engines can read through the Horizon Iceberg REST Catalog API. Per Snowflake Engineering Blog (2026-05-07).
  • [Bug report] Gravitino Iceberg REST server: Trino requests `format_version = 3` when creating a table through a remote Iceberg REST Catalog (IRC), but the resulting table ends up with `format_version = 2`. The report also involves credential vending and S3 token generation. Per GitHub (apache/gravitino) (2026-05-07).
  • The REST Catalog is becoming a control-plane substrate, not just a metadata lookup. Two 2026 moves push the spec past "list namespaces, load metadata." First, scan-planning offload: Apache Gravitino's Iceberg REST Catalog (IRC) server now performs Iceberg scan planning on behalf of engines like DuckDB and Spark, with a scan-planning cache, shifting planning from client to catalog. Second, the IRC endpoint is now the seam vendors use to expose their own table estates to the open ecosystem: Microsoft OneLake exposes an Iceberg REST Catalog API so that external engines (Snowflake, Dremio, Trino) can query Microsoft Fabric tables via standard IRC connection strings, and Snowflake↔OneLake bidirectional interoperability is now GA. The catalog API is quietly absorbing planning, credential vending, and cross-vendor federation, which together form the control surface of the lakehouse. Per Gravitino 1.2.0 release notes; Microsoft Fabric, "Iceberg support in OneLake"; and Microsoft, "OneLake and Snowflake interoperability is now GA."
