OpenMetadata
An open-source metadata platform providing a centralized catalog for data discovery, quality, lineage, and governance across S3-based data lakes and lakehouses.
Summary
OpenMetadata sits in the governance and discovery layer above S3 storage and query engines. It ingests metadata from Iceberg tables, Spark jobs, Airflow DAGs, and other tools to provide a unified view of what data exists, who owns it, and how it flows through the organization.
- OpenMetadata is a metadata platform, not a query engine or a table catalog. It discovers and displays metadata from external systems (Glue, HMS, Iceberg catalogs) but does not replace them.
- Data quality checks in OpenMetadata require configuring profiler workflows. The platform does not automatically validate data without explicit setup.
- Deploying OpenMetadata requires running its own backend services (API server, database, Airflow for ingestion). It is not a lightweight tool.
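The second point above says quality checks only run once a profiler workflow has been configured. As a rough sketch of what that setup involves, the Python below assembles a workflow definition as a plain dictionary and checks that its main sections are present. The section and key names (`source`, `processor`, `sink`, `workflowConfig`, `orm-profiler`, `metadata-rest`) follow the general layout of OpenMetadata workflow configs as commonly documented, but they are assumptions here and should be verified against the connector reference for the version in use. The connector type, service name, host, token, and both helper functions are hypothetical placeholders.

```python
import json

# Top-level sections this sketch expects in a workflow definition (assumed layout).
REQUIRED_SECTIONS = ("source", "processor", "sink", "workflowConfig")


def build_profiler_workflow(connector_type, service_name, host_port, jwt_token):
    """Return a profiler workflow definition as a plain dict (illustrative only)."""
    return {
        "source": {
            "type": connector_type,        # placeholder; depends on the source system
            "serviceName": service_name,   # must match a service already registered
            "sourceConfig": {"config": {"type": "Profiler"}},
        },
        "processor": {"type": "orm-profiler", "config": {}},
        "sink": {"type": "metadata-rest", "config": {}},
        "workflowConfig": {
            "openMetadataServerConfig": {
                "hostPort": host_port,
                "authProvider": "openmetadata",
                "securityConfig": {"jwtToken": jwt_token},
            }
        },
    }


def missing_sections(workflow):
    """List the required top-level sections that are absent from a workflow dict."""
    return [name for name in REQUIRED_SECTIONS if name not in workflow]


workflow = build_profiler_workflow(
    connector_type="<connector-type>",
    service_name="my_lake_service",
    host_port="http://localhost:8585/api",
    jwt_token="<jwt-token>",
)
print(json.dumps(workflow, indent=2))
print("missing sections:", missing_sections(workflow))
```

A definition like this would normally be written as YAML and handed to the ingestion framework; the dictionary form is used here only to keep the example self-contained.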
scoped_to: Metadata Management (centralized metadata discovery and governance)
enables: Audit Trails (tracks metadata change history)
alternative_to: DataHub, Apache Atlas (open-source metadata platform alternatives)
depends_on: AWS Glue Catalog, Hive Metastore (ingests metadata from catalogs)
Definition
An open-source metadata platform that provides data discovery, lineage, quality, and governance for S3-based data lakes and lakehouses. Ingests metadata from catalogs, query engines, and pipelines to build a unified metadata graph.
As data lakes grow, teams lose track of what data exists, where it came from, who owns it, and whether it is trustworthy. OpenMetadata centralizes this information with automated metadata ingestion from S3-based sources.
Typical uses: data discovery and cataloging for S3 lakehouses, automated lineage tracking, data quality monitoring, and governance and ownership management.
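The "unified metadata graph" in the definition above can be pictured as assets connected by derived-from edges. The sketch below is a minimal in-memory illustration of lineage lookup over such a graph, not OpenMetadata's actual data model or API. The asset names, the `UPSTREAM` mapping, and the `upstream_lineage` function are all hypothetical.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets it is derived from.
UPSTREAM = {
    "dashboard.revenue": ["iceberg.sales_daily"],
    "iceberg.sales_daily": ["iceberg.sales_raw", "iceberg.fx_rates"],
    "iceberg.sales_raw": ["s3://lake/raw/sales/"],
    "iceberg.fx_rates": [],
    "s3://lake/raw/sales/": [],
}


def upstream_lineage(asset, edges, max_depth=None):
    """Breadth-first walk over upstream edges; returns assets in discovery order.

    max_depth limits how many hops away from the starting asset are followed.
    """
    seen = {asset}
    order = []
    queue = deque([(asset, 0)])
    while queue:
        node, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not expand nodes at the depth limit
        for parent in edges.get(node, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append((parent, depth + 1))
    return order


print(upstream_lineage("dashboard.revenue", UPSTREAM))
print(upstream_lineage("dashboard.revenue", UPSTREAM, max_depth=1))
```

The first call returns every upstream asset of the dashboard, down to the S3 prefix; the second stops after one hop, which is the kind of bound that keeps a lineage view small when only immediate dependencies matter.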
Recent developments
- Latest release: 1.13.0 (GA June 8, 2026); 1.12.x is the maintained patch line (latest 1.12.11, June 12). 1.13.0 adds MCP Services as a first-class service category plus Knowledge Graph / RDF support. Although 1.12.11 carries the later date, it is a backport on the older line and is not newer than 1.13.0. Source: open-metadata/OpenMetadata releases.
- Operational metrics: 94.7% issue-resolution rate and 0.9-hour median PR merge time, per a 2026 OpenMetadata project-health review. For organizations that treat community responsiveness as a procurement criterion, these are unusually fast numbers, and they fit the "younger but actively maintained" framing that OpenMetadata occupies relative to the older DataHub project (11,600+ stars, a three-year head start, and 3,457 forks against OpenMetadata's lower count).
- Honest caveats from independent reviews. Per a 2026 OpenMetadata open-source data-catalog review, the main operational caveat is UI performance at scale: lineage graphs with 500+ nodes can become slow to render in the browser. Also flagged: OpenMetadata is less battle-tested at extreme scale than DataHub. Decision framing for 2026: OpenMetadata wins on developer velocity and recent feature investment; DataHub wins on production-tested-at-scale references.
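The release note in the first item above turns on a detail that is easy to get wrong in tooling: the most recently published release is not necessarily the highest version. The short sketch below uses the two releases named there to show the difference between ordering by date and ordering by version number. The year of the 1.12.11 date is assumed to be 2026, and the `version_key` helper is illustrative; it handles plain `major.minor.patch` strings only.

```python
from datetime import date

# Releases as stated above; the year for 1.12.11 is an assumption.
RELEASES = [
    ("1.13.0", date(2026, 6, 8)),
    ("1.12.11", date(2026, 6, 12)),
]


def version_key(version):
    """Turn 'major.minor.patch' into a tuple of ints so it compares numerically."""
    return tuple(int(part) for part in version.split("."))


latest_by_date = max(RELEASES, key=lambda release: release[1])[0]
latest_by_version = max(RELEASES, key=lambda release: version_key(release[0]))[0]

print(latest_by_date)     # 1.12.11 (most recently published, a backport)
print(latest_by_version)  # 1.13.0 (highest version)
```

Comparing the raw strings would also be unreliable in general, since string comparison goes character by character rather than by numeric component, which is why the helper converts each part to an integer first.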
Connections 9
Outbound 7: scoped_to 2, implements 1, depends_on 1, solves 1, alternative_to 2
Inbound 2: alternative_to 2
Resources 3
Official OpenMetadata platform for data discovery, lineage, governance, and quality across S3-based data lake assets.
OpenMetadata source repository with connectors for S3, Glue, Iceberg, and other lakehouse components.
OpenMetadata connector reference covering automated metadata ingestion from databases, data lakes, and dashboards.