Apache Gravitino
A unified metadata lake, a "catalog of catalogs", that federates Iceberg, Hive, Kafka, and file-based data sources into a single governance layer. An Apache top-level project that graduated from the Apache Incubator in 2025.
Summary
In environments with multiple catalogs (Glue, Hive Metastore, Polaris, Unity), Gravitino sits above them all, providing a unified metadata view. Engineers discover and govern data from a single pane regardless of which catalog or storage layer holds it.
- Gravitino does not replace individual catalogs — it federates them. You still need Polaris, Glue, or Unity underneath.
- Lineage features are still maturing. Production lineage workflows may need supplementation with OpenLineage/Marquez.
- implements: Iceberg REST Catalog Spec (exposes federated metadata via the standard REST interface)
- enables: Apache Polaris (can federate Polaris alongside other catalogs)
- solves: Vendor Lock-In (unified view across multi-vendor catalog environments)
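Because Gravitino exposes metadata through the Iceberg REST Catalog Spec, any client that speaks the spec's routes can browse it. The sketch below builds the spec's "list namespaces" and "list tables" URLs and parses a list-tables response. The base URL (`http://localhost:9001/iceberg`) is an assumption about a local deployment, and the helper names are hypothetical; check your own server configuration before relying on either.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: a locally running Iceberg REST endpoint; adjust to your deployment.
BASE_URL = "http://localhost:9001/iceberg"


def namespaces_url(base_url, prefix=""):
    """Build the spec's list-namespaces route: /v1/{prefix}/namespaces."""
    parts = [base_url.rstrip("/"), "v1"]
    if prefix:
        parts.append(quote(prefix, safe=""))
    parts.append("namespaces")
    return "/".join(parts)


def tables_url(base_url, namespace, prefix=""):
    """Build the list-tables route for a (possibly multi-level) namespace."""
    # The spec joins namespace levels with the 0x1F unit separator, URL-encoded.
    encoded = quote("\x1f".join(namespace), safe="")
    return namespaces_url(base_url, prefix) + "/" + encoded + "/tables"


def parse_table_identifiers(payload):
    """Turn a list-tables response body into dotted 'namespace.table' names."""
    return [
        ".".join(ident["namespace"] + [ident["name"]])
        for ident in payload.get("identifiers", [])
    ]


def list_tables(base_url, namespace, prefix=""):
    """Fetch and parse the tables in one namespace (performs a network call)."""
    request = Request(
        tables_url(base_url, namespace, prefix),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=10) as response:
        return parse_table_identifiers(json.load(response))
```

Calling `list_tables(BASE_URL, ["sales", "eu"])` against a reachable server would return names such as `sales.eu.orders`; the URL builders and the parser can be exercised without a server.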
Definition
A unified metadata lake, a "catalog of catalogs", that provides a single governance layer across Iceberg, Hive, Kafka, and file-based data sources. An Apache top-level project (graduated from the Apache Incubator in 2025), originally developed by Datastrato.
Enterprises run multiple data catalogs (Glue, Hive Metastore, Unity Catalog) across different environments. Gravitino federates these into a unified metadata view so engineers can discover and govern data from a single pane regardless of which catalog or storage layer holds it.
Typical use cases: federated metadata management across hybrid and multi-cloud environments, unified data discovery, and cross-catalog governance.
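The federation idea can be illustrated with a toy model. This is a conceptual sketch only, not Gravitino's API: `FederatedView`, `register`, and the catalog names are all hypothetical. It shows the key property described above, namely that the federating layer indexes the underlying catalogs and answers discovery queries across them, while the tables themselves stay in the original catalogs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical type: a function returning the "schema.table" names in one catalog.
TableLister = Callable[[], List[str]]


@dataclass
class FederatedView:
    """Toy 'catalog of catalogs': it indexes catalogs but stores no tables itself."""

    listers: Dict[str, TableLister] = field(default_factory=dict)

    def register(self, catalog_name: str, lister: TableLister) -> None:
        """Attach an underlying catalog (Glue, Hive Metastore, ...) by name."""
        self.listers[catalog_name] = lister

    def all_tables(self) -> List[str]:
        """Return every table, qualified as 'catalog.schema.table', sorted."""
        return sorted(
            f"{catalog}.{table}"
            for catalog, lister in self.listers.items()
            for table in lister()
        )

    def find(self, table_name: str) -> List[str]:
        """Discover a table by its short name across all registered catalogs."""
        return [
            name for name in self.all_tables()
            if name.rsplit(".", 1)[-1] == table_name
        ]


view = FederatedView()
view.register("glue_prod", lambda: ["sales.orders"])
view.register("hive_legacy", lambda: ["archive.orders", "archive.customers"])
print(view.find("orders"))
```

Each lister is called at query time, so the view always reflects what the underlying catalog currently reports; a real system would add caching, authorization, and error handling around those calls.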
Recent developments
- Latest release: v1.2.1 (current as of June 2026). Tracking the upstream stable release line. Per apache/gravitino releases.
- ASF Board Meeting Minutes. Traffic on dev@gravitino.apache.org rose 5% over the past quarter (139 vs 132 emails). Two committer candidate nominations are underway. Talks and events: QCon Shanghai, Data for AI meetups (Bay Area, Shanghai), COSCon Beijing. Per whimsy.apache.org (2026-04-15).
- ASF Releases, March 2026. Apache release roundup lists gravitino-1.2.0 (2026-03-12) plus multiple Trino connector releases (435-478) for Gravitino. Per community.apache.org (2026-04-01).
- Apache Gravitino 1.2.0. Released with a Table Maintenance Service (TMS), a ClickHouse catalog, end-to-end UDF management, authorization for Iceberg view operations, a redesigned Web UI, and broad connector improvements. Per gravitino.apache.org (2026-03-13).
- Gravitino is now a two-sided control plane: it offloads query planning and feeds AI agents. Two threads converge on the same "catalog as control plane" idea. On the engine side, 1.2.0 (March 13, 2026) lets query engines such as DuckDB and Spark offload Iceberg scan planning to Gravitino's Iceberg REST Catalog (IRC) server, which includes a scan-planning cache; planning moves out of the client and into the catalog, cutting metadata I/O and client-side complexity. On the agent side, Gravitino's AI-native metadata layer lets agents discover and reason over data context through the same governance plane. That layer consists of a Model Catalog plus a built-in MCP server that exposes governed metadata to AI tools, both introduced in the 1.1.0 "AI-native metadata management platform" release (December 16, 2025). Per the Gravitino 1.2.0 release notes and the Gravitino 1.1.0 announcement, "An AI-native metadata management platform".
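To make the scan-planning cache idea concrete, here is a minimal sketch of how a catalog-side cache could avoid repeated planning work. It is not Gravitino's implementation: `ScanPlanCache`, the planner callback, and the key layout are assumptions chosen for illustration. The sketch relies on one well-known Iceberg property, that a snapshot is immutable, so a plan keyed by table, snapshot ID, and filter can be reused safely.

```python
from collections import OrderedDict
from typing import Callable, List, Tuple

# Hypothetical cache key: (table identifier, snapshot id, filter expression).
PlanKey = Tuple[str, int, str]
Planner = Callable[[str, int, str], List[str]]


class ScanPlanCache:
    """Toy LRU cache of scan plans, represented here as lists of data file paths."""

    def __init__(self, planner: Planner, max_entries: int = 128) -> None:
        self._planner = planner
        self._max_entries = max_entries
        self._plans: "OrderedDict[PlanKey, List[str]]" = OrderedDict()
        self.misses = 0  # number of times the expensive planner actually ran

    def plan(self, table: str, snapshot_id: int, filter_expr: str) -> List[str]:
        key = (table, snapshot_id, filter_expr)
        if key in self._plans:
            # Cache hit: mark as most recently used and skip planning.
            self._plans.move_to_end(key)
            return self._plans[key]
        self.misses += 1
        files = self._planner(table, snapshot_id, filter_expr)
        self._plans[key] = files
        if len(self._plans) > self._max_entries:
            # Evict the least recently used plan.
            self._plans.popitem(last=False)
        return files


def fake_planner(table: str, snapshot_id: int, filter_expr: str) -> List[str]:
    # Stand-in for real planning, which would read manifests and prune files.
    return [f"{table}/snap-{snapshot_id}/part-0.parquet"]


cache = ScanPlanCache(fake_planner, max_entries=2)
cache.plan("sales.orders", 1, "region = 'eu'")
cache.plan("sales.orders", 1, "region = 'eu'")  # served from the cache
print(cache.misses)
```

A new commit produces a new snapshot ID and therefore a new key, so stale plans are never served for the current table state; they simply age out of the LRU.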
Resources
Official Apache Gravitino project site covering the unified metadata lake architecture and multi-catalog federation.
Source repository with architecture documentation, integration guides, and release history.
Community discussion of the Gravitino 1.0 release covering real-world adoption considerations and comparisons to Unity Catalog.