Unity Catalog
An open-source, multi-format data catalog by Databricks (Linux Foundation), supporting Iceberg, Delta Lake, Hudi, and unstructured data with built-in access control and lineage.
Summary
Unity Catalog bridges the Databricks ecosystem with the broader open lakehouse world. For organizations with significant Delta Lake investments that also need Iceberg interoperability, Unity provides a single catalog that spans both formats without requiring XTable or UniForm.
- Open-source Unity Catalog is not identical to the managed Databricks Unity Catalog. Feature parity varies; some governance features are Databricks-only.
- Multi-format support does not mean seamless interoperability. Each format still has its own metadata semantics; Unity provides unified access, not automatic translation.
implements: Iceberg REST Catalog Spec (standard REST API for engine-neutral access)
enables: Delta Lake, Apache Iceberg (multi-format catalog support)
solves: Vendor Lock-In (open alternative to proprietary Databricks catalog)
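To make the "engine-neutral access" point concrete, here is a minimal sketch of how a client might build the Iceberg REST "load table" URL against a Unity Catalog server. The route shape (`/v1/{prefix}/namespaces/{namespace}/tables/{table}`, with multi-level namespaces joined by the 0x1F unit separator) comes from the Iceberg REST Catalog spec. The base URL, the prefix value, and the namespace and table names are assumptions for illustration; a real client would take the prefix from the server's `/v1/config` response.

```python
from urllib.parse import quote

# Assumption: address and Iceberg REST path of a locally running open-source
# Unity Catalog server. Adjust for a real deployment.
UC_ICEBERG_BASE = "http://localhost:8080/api/2.1/unity-catalog/iceberg"


def load_table_url(base, prefix, namespace, table):
    """Build the Iceberg REST 'load table' URL.

    base:      root of the catalog's Iceberg REST endpoint
    prefix:    optional path prefix (normally returned by GET /v1/config)
    namespace: list of namespace levels, e.g. ["default"]
    table:     table name
    """
    # The spec joins multi-level namespaces with the 0x1F unit separator,
    # which percent-encodes to %1F.
    ns = quote("\x1f".join(namespace), safe="")
    parts = [base.rstrip("/"), "v1"]
    if prefix:
        # A prefix may itself contain slashes, so keep them unescaped.
        parts.append(quote(prefix.strip("/"), safe="/"))
    parts += ["namespaces", ns, "tables", quote(table, safe="")]
    return "/".join(parts)


# Hypothetical prefix, namespace, and table names.
print(load_table_url(UC_ICEBERG_BASE, "catalogs/unity", ["default"], "marksheet"))
```

Any engine that speaks the Iceberg REST protocol issues a GET against a URL of this shape, which is why the catalog does not need engine-specific connectors for read access.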
Definition
An open-source, multi-format data catalog originally developed by Databricks and donated to the Linux Foundation. Supports Iceberg, Delta Lake, Hudi, and unstructured data with built-in access control and lineage.
The lakehouse ecosystem needs a catalog that is not tied to a single table format or vendor. Unity Catalog provides a unified governance layer for Delta Lake users who also need Iceberg interoperability, and for organizations standardizing on open catalog APIs.
Typical uses: a multi-format catalog for Delta and Iceberg tables on S3, unified access control across storage and compute, and data lineage tracking.
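As an illustration of what "unified access control" can mean in practice, the following toy model checks read access the same way for every table, whatever its format. It is modelled on the privilege names and downward inheritance documented for managed Databricks Unity Catalog (`USE CATALOG`, `USE SCHEMA`, `SELECT`); the open-source server may behave differently. The `Grants` class, `can_read` helper, principals, and table names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Grants:
    """Toy grant store: (principal, securable name) -> set of privileges."""
    table: dict = field(default_factory=dict)

    def grant(self, principal, securable, privilege):
        self.table.setdefault((principal, securable), set()).add(privilege)

    def has(self, principal, securable, privilege):
        """True if the privilege is granted on the securable or any ancestor."""
        parts = securable.split(".")
        # Walk from the object itself up to its catalog.
        for depth in range(len(parts), 0, -1):
            ancestor = ".".join(parts[:depth])
            if privilege in self.table.get((principal, ancestor), set()):
                return True
        return False


def can_read(grants, principal, table_name):
    """Check read access to a three-level name: catalog.schema.table."""
    catalog, schema, _ = table_name.split(".")
    return (
        grants.has(principal, catalog, "USE CATALOG")
        and grants.has(principal, f"{catalog}.{schema}", "USE SCHEMA")
        and grants.has(principal, table_name, "SELECT")
    )


grants = Grants()
grants.grant("analyst", "sales", "USE CATALOG")
grants.grant("analyst", "sales", "USE SCHEMA")  # inherited by every schema in the catalog
grants.grant("analyst", "sales.eu.orders_iceberg", "SELECT")

print(can_read(grants, "analyst", "sales.eu.orders_iceberg"))
print(can_read(grants, "analyst", "sales.eu.orders_delta"))
```

The check never looks at the table format: an Iceberg table and a Delta table in the same schema are governed by the same grants, which is the property the catalog is meant to provide.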
Recent developments
- Latest OSS release: v0.5.0 (GA 2026-06-18). The open-source Unity Catalog server (unitycatalog/unitycatalog) adds the UC Delta API for managing Delta tables, per-Spark-version artifacts, and credential-scoped filesystems by default. It is distinct from the Databricks-managed UC service described below. Per unitycatalog releases.
- Governance primitives reach GA across all three clouds. Per the April 2026 Databricks on Google Cloud release notes, Data Classification and Governed Tags both went GA in April 2026; these are the two features most organizations use to judge whether Unity Catalog is production-ready. The May 2026 Databricks on AWS release notes bring Catalog commits to GA, plus app telemetry persistence into Unity Catalog tables (observability becomes a first-class governed dataset, not a side channel). The cumulative effect: UC has moved from "feature-incomplete" to "default expectation" faster than the open-source counterparts (Apache Polaris, Iceberg REST Catalog) have shipped equivalents.
- MANAGE permissions auto-propagate to materialized views and streaming tables. Per the Lakeflow / Spark Declarative Pipelines 2026 release notes, as of March 2026 Unity Catalog auto-propagates MANAGE permissions from base tables to derived MVs and streaming tables, eliminating a class of permission-drift incidents that previously left downstream artifacts with stale grants. Pipeline settings can now also live inside Unity Catalog table properties rather than in external YAML, which collapses the catalog-vs-config split that had been a quiet operational drag.
- CRTAS optimized writes by default, plus Python UDTF custom-dependency support. Per the Databricks SQL 2026 release notes, optimized writes for CRTAS (CREATE OR REPLACE TABLE AS SELECT) on Unity Catalog tables became the default in March 2026, a quiet but meaningful performance improvement for any pipeline that does idempotent table rebuilds. Unity Catalog Python UDTFs now support custom dependencies, removing the "you can only use what's in the workspace runtime" constraint that pushed teams toward external compute for non-standard libraries.
- Full Iceberg GA plus managed MCP servers make Unity Catalog an agent-facing control plane. Databricks moved Managed Iceberg, Foreign Iceberg, and Iceberg v3 to general availability, with Unity Catalog implementing the Iceberg REST Catalog API so that external engines read and write managed Iceberg tables through the open standard. UC stops being a Delta-first catalog and becomes a genuinely multi-format one. In parallel, Databricks' managed MCP servers expose Unity Catalog tables, functions, and Vector Search indexes natively to AI agents, so the same governance boundary that scopes human access now scopes agent access. The catalog, not the engine, becomes the place where both people and agents are granted (or denied) data. Per Databricks, "Unity Catalog and the next era of Apache Iceberg" and "Expanded interoperability with Unity Catalog Open APIs".
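The MANAGE auto-propagation described in the list above can be pictured as a walk over the lineage graph from base tables to everything derived from them. The sketch below is a toy model, not Databricks' implementation: the `propagate_manage` function, the object names, and the rule that a derived object receives the union of its upstream MANAGE holders are all assumptions made for illustration.

```python
from collections import deque


def propagate_manage(derived_from, manage_grants):
    """Return MANAGE holders per object after pushing grants downstream.

    derived_from:  derived object -> list of objects it reads from
    manage_grants: base table -> set of principals holding MANAGE
    """
    # Invert the lineage map so we can walk upstream -> downstream.
    downstream = {}
    for child, parents in derived_from.items():
        for parent in parents:
            downstream.setdefault(parent, []).append(child)

    result = {obj: set(holders) for obj, holders in manage_grants.items()}
    queue = deque(manage_grants)
    while queue:
        obj = queue.popleft()
        for child in downstream.get(obj, []):
            holders = result.setdefault(child, set())
            before = len(holders)
            holders |= result[obj]
            # Revisit the child's own dependants only if it gained a holder.
            if len(holders) > before:
                queue.append(child)
    return result


# Hypothetical lineage: a materialized view over one base table, and a
# streaming table that reads from the view and a second base table.
lineage = {"mv_daily": ["orders"], "st_alerts": ["mv_daily", "events"]}
grants = {"orders": {"alice"}, "events": {"bob"}}

for name, holders in sorted(propagate_manage(lineage, grants).items()):
    print(name, sorted(holders))
```

Without propagation of this kind, `mv_daily` and `st_alerts` would keep whatever grants they had at creation time, which is the permission drift the release note refers to.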
Connections 9
Outbound 7: scoped_to 2, enables 2, solves 1
Inbound 2: depends_on 1, enables 1
Resources 3
Official Unity Catalog open-source project site with documentation on multi-format catalog capabilities and governance features.
Source repository for open-source Unity Catalog with setup guides, API documentation, and integration examples.
Side-by-side comparison of Unity Catalog and Polaris covering RBAC, format support, and deployment trade-offs.