Technology

Apache Gravitino

A unified metadata lake — "catalog of catalogs" — that federates Iceberg, Hive, Kafka, and file-based data sources into a single governance layer. Apache incubating project.

6 connections 3 resources

Summary

What it is

A unified metadata lake — "catalog of catalogs" — that federates Iceberg, Hive, Kafka, and file-based data sources into a single governance layer. Apache incubating project.

Where it fits

In environments with multiple catalogs (Glue, Hive Metastore, Polaris, Unity), Gravitino sits above them all, providing a unified metadata view. Engineers discover and govern data from a single pane regardless of which catalog or storage layer holds it.

Misconceptions / Traps
  • Gravitino does not replace individual catalogs — it federates them. You still need Polaris, Glue, or Unity underneath.
  • Lineage features are still maturing. Production lineage workflows may need supplementation with OpenLineage/Marquez.
Key Connections
  • implements Iceberg REST Catalog Spec — exposes federated metadata via the standard REST interface
  • enables Apache Polaris — can federate Polaris alongside other catalogs
  • solves Vendor Lock-In — unified view across multi-vendor catalog environments

Definition

What it is

A unified metadata lake — a "catalog of catalogs" — that provides a single governance layer across Iceberg, Hive, Kafka, and file-based data sources. An Apache incubating project originally developed by Datastrato.

Why it exists

Enterprises run multiple data catalogs (Glue, Hive Metastore, Unity Catalog) across different environments. Gravitino federates these into a unified metadata view so engineers can discover and govern data from a single pane regardless of which catalog or storage layer holds it.

Primary use cases

Federated metadata management across hybrid and multi-cloud environments, unified data discovery, cross-catalog governance.

Connections 6

Outbound 6

Resources 3