OpenMetadata
An open-source metadata platform providing a centralized catalog for data discovery, quality, lineage, and governance across S3-based data lakes and lakehouses.
Summary
OpenMetadata sits in the governance and discovery layer above S3 storage and query engines. It ingests metadata from Iceberg tables, Spark jobs, Airflow DAGs, and other tools to provide a unified view of what data exists, who owns it, and how it flows through the organization.
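To make the ingestion step concrete, here is a minimal sketch of a workflow definition for pulling table metadata from an AWS Glue catalog, written as a Python dictionary. The source / sink / workflowConfig layout and the field names are assumptions modeled on OpenMetadata's ingestion configuration and should be checked against the connector reference for the deployed version. The service name, region, and server URL are placeholders, and `build_glue_ingestion_config` and `missing_sections` are hypothetical helpers, not part of any OpenMetadata package.

```python
import json
from typing import Dict, List

# Top-level sections this sketch assumes every ingestion workflow needs.
REQUIRED_SECTIONS = ("source", "sink", "workflowConfig")


def build_glue_ingestion_config(service_name: str, aws_region: str, server_url: str) -> Dict:
    """Assemble a workflow definition (hypothetical helper; field names are assumptions)."""
    return {
        "source": {
            "type": "glue",
            "serviceName": service_name,
            "serviceConnection": {
                "config": {"type": "Glue", "awsConfig": {"awsRegion": aws_region}}
            },
            # Ask for table and schema metadata only, not profiling or lineage.
            "sourceConfig": {"config": {"type": "DatabaseMetadata"}},
        },
        # Send what was discovered to the OpenMetadata API server.
        "sink": {"type": "metadata-rest", "config": {}},
        "workflowConfig": {
            "openMetadataServerConfig": {
                "hostPort": server_url,
                "authProvider": "openmetadata",
            }
        },
    }


def missing_sections(config: Dict) -> List[str]:
    """Return the required top-level sections that are absent from a config."""
    return [section for section in REQUIRED_SECTIONS if section not in config]


# Placeholder values; replace with the real service name, region, and server URL.
config = build_glue_ingestion_config("glue_lake", "us-east-1", "http://localhost:8585/api")
print(json.dumps(config, indent=2))
```

The same structure is usually kept as a YAML file and handed to the ingestion framework. Building it in code is only a convenient way to show which parts describe the external system (`source`) and which describe the OpenMetadata server (`workflowConfig`).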
- OpenMetadata is a metadata platform, not a query engine or a table catalog. It discovers and displays metadata from external systems (AWS Glue, Hive Metastore, Iceberg catalogs) but does not replace them.
- Data quality checks in OpenMetadata require configuring profiler workflows. The platform does not automatically validate data without explicit setup.
- Deploying OpenMetadata requires running its own backend services (API server, database, Airflow for ingestion). It is not a lightweight tool.
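The point about data quality can be illustrated with a toy model: checks run only if someone has declared them. The sketch below is plain Python and does not use the OpenMetadata API. `ColumnCheck`, `not_null`, and `run_checks` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class ColumnCheck:
    """One explicitly configured test on a single column (hypothetical type)."""
    column: str
    name: str
    predicate: Callable[[List[Any]], bool]


def not_null(values: List[Any]) -> bool:
    """Pass only if no value in the column is None."""
    return all(value is not None for value in values)


def run_checks(rows: List[Dict[str, Any]], checks: List[ColumnCheck]) -> Dict[str, bool]:
    """Evaluate each configured check; an empty check list validates nothing."""
    results = {}
    for check in checks:
        values = [row.get(check.column) for row in rows]
        results[f"{check.column}.{check.name}"] = check.predicate(values)
    return results


rows = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": None}]
checks = [ColumnCheck("id", "not_null", not_null), ColumnCheck("email", "not_null", not_null)]

print(run_checks(rows, checks))  # {'id.not_null': True, 'email.not_null': False}
print(run_checks(rows, []))      # {} because nothing was configured
```

The second call returns an empty result even though the data contains a null, which mirrors the situation where no profiler or test workflow has been set up for a table.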
- scoped_to: Metadata Management (centralized metadata discovery and governance)
- enables: Audit Trails (tracks metadata change history)
- alternative_to: DataHub, Apache Atlas (open-source metadata platform alternatives)
- depends_on: AWS Glue Catalog, Hive Metastore (ingests metadata from catalogs)
Definition
An open-source metadata platform that provides data discovery, lineage, quality, and governance for S3-based data lakes and lakehouses. It ingests metadata from catalogs, query engines, and pipelines to build a unified metadata graph.
As data lakes grow, teams lose track of what data exists, where it came from, who owns it, and whether it is trustworthy. OpenMetadata centralizes this information with automated metadata ingestion from S3-based sources.
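The questions above (what exists, where it came from, who owns it) map naturally onto a graph of assets connected by lineage edges. The following is a toy in-memory sketch of that idea, not OpenMetadata's data model. `MetadataGraph` and the asset names are hypothetical.

```python
from collections import deque
from typing import Dict, Set


class MetadataGraph:
    """Toy metadata graph: assets with owners, joined by lineage edges."""

    def __init__(self) -> None:
        self.owners: Dict[str, str] = {}
        self.upstream: Dict[str, Set[str]] = {}

    def register(self, asset: str, owner: str) -> None:
        """Record that an asset exists and who owns it."""
        self.owners[asset] = owner

    def add_lineage(self, source: str, target: str) -> None:
        """Record that `target` is derived from `source`."""
        self.upstream.setdefault(target, set()).add(source)

    def all_upstream(self, asset: str) -> Set[str]:
        """Return every asset that `asset` depends on, directly or indirectly."""
        seen: Set[str] = set()
        queue = deque([asset])
        while queue:
            current = queue.popleft()
            for parent in self.upstream.get(current, set()):
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        return seen


graph = MetadataGraph()
graph.register("s3://lake/raw/orders", "ingest-team")
graph.register("iceberg.sales.orders", "analytics-team")
graph.register("dashboard.revenue", "bi-team")
graph.add_lineage("s3://lake/raw/orders", "iceberg.sales.orders")
graph.add_lineage("iceberg.sales.orders", "dashboard.revenue")

# Where did the dashboard's data come from, and who owns each input?
for parent in sorted(graph.all_upstream("dashboard.revenue")):
    print(parent, "owned by", graph.owners[parent])
```

A breadth-first walk over upstream edges answers the provenance question for any asset. The owner lookup then answers who to contact when an input looks untrustworthy.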
Typical use cases are data discovery and cataloging for S3 lakehouses, automated lineage tracking, data quality monitoring, and governance and ownership management.
Resources
Official OpenMetadata platform for data discovery, lineage, governance, and quality across S3-based data lake assets.
OpenMetadata source repository with connectors for S3, Glue, Iceberg, and other lakehouse components.
OpenMetadata connector reference covering automated metadata ingestion from databases, data lakes, and dashboards.