Marquez
The reference implementation for OpenLineage — an open-source metadata and lineage service with a web UI for visualizing data flows across S3-based pipelines.
Summary
Marquez is the backend that makes OpenLineage actionable. It collects lineage events from Spark, Airflow, dbt, and other tools, stores them in a searchable database, and provides a UI for engineers to trace data provenance and debug pipeline failures.
- Marquez requires instrumentation. Pipelines must emit OpenLineage events via integrations or SDKs — lineage does not appear automatically.
- Metadata storage can become a bottleneck at massive scale. Production deployments need careful indexing and retention policies.
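To make the instrumentation requirement concrete, here is a minimal sketch of what emitting an event by hand could look like when no ready-made integration is available. It builds an OpenLineage run event as a plain dictionary and posts it to Marquez's lineage endpoint (`/api/v1/lineage`). The host and port, the producer URI, the spec version in `schemaURL`, and all namespace, job, and dataset names are assumptions for illustration. In practice the Spark, Airflow, or dbt integrations construct these events for you.

```python
import json
import uuid
from datetime import datetime, timezone
from urllib import request

# Assumed address of a local Marquez instance; adjust for your deployment.
MARQUEZ_LINEAGE_URL = "http://localhost:5000/api/v1/lineage"


def build_run_event(event_type, run_id, job_namespace, job_name, inputs, outputs):
    """Build a minimal OpenLineage run event as a plain dict.

    `inputs` and `outputs` are lists of (namespace, name) pairs.
    """
    def dataset(namespace, name):
        return {"namespace": namespace, "name": name}

    return {
        "eventType": event_type,  # e.g. "START" or "COMPLETE"
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [dataset(ns, n) for ns, n in inputs],
        "outputs": [dataset(ns, n) for ns, n in outputs],
        # Hypothetical producer URI identifying the emitting code.
        "producer": "https://example.com/hypothetical-producer",
        # Spec version is an assumption; use the one your tooling targets.
        "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json#/definitions/RunEvent",
    }


def send_event(event, url=MARQUEZ_LINEAGE_URL):
    """POST the event as JSON and return the HTTP status code."""
    req = request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return resp.status
```

A job would typically call `build_run_event("START", ...)` when it begins and again with `"COMPLETE"` and the same `run_id` when it finishes, passing each result to `send_event`. Until events like these arrive, Marquez has nothing to display.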
implements: OpenLineage (reference implementation of the lineage standard)
enables: Lakehouse Architecture (governance and observability layer)
scoped_to: S3, Lakehouse
Definition
Marquez is an open-source metadata and lineage service that serves as the reference implementation for the OpenLineage standard. It provides a web UI and a REST API for collecting, storing, and visualizing data lineage across S3-based data pipelines.
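As a rough illustration of the REST API side, the sketch below builds the URL for a lineage-graph query. It assumes Marquez's `GET /api/v1/lineage` endpoint with a `nodeId` of the form `dataset:<namespace>:<name>` or `job:<namespace>:<name>` and an optional `depth` parameter. The base URL and dataset names are hypothetical, and the parameter details should be checked against the API reference for the Marquez version in use.

```python
from urllib.parse import urlencode


def lineage_query_url(base_url, node_type, namespace, name, depth=None):
    """Build a lineage query URL for a dataset or job node.

    The nodeId layout and `depth` parameter are assumptions about the
    Marquez API; confirm them for your deployment.
    """
    if node_type not in ("dataset", "job"):
        raise ValueError("node_type must be 'dataset' or 'job'")
    params = {"nodeId": f"{node_type}:{namespace}:{name}"}
    if depth is not None:
        params["depth"] = depth
    # urlencode percent-escapes ':' and '/' so S3-style namespaces are safe.
    return f"{base_url.rstrip('/')}/api/v1/lineage?{urlencode(params)}"
```

Fetching the resulting URL with any HTTP client should return the graph of jobs and datasets around the chosen node, which is the same data the web UI renders.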
As data pipelines on S3 grow in complexity, engineers need visibility into where data comes from, how it transforms, and where it flows. Marquez collects OpenLineage events from Spark, Airflow, and other tools and provides a searchable, visual lineage graph.
Common use cases include data lineage visualization for S3 lakehouse pipelines, pipeline debugging and impact analysis, and regulatory compliance and data auditing.
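Impact analysis comes down to a graph walk over the input and output datasets recorded in lineage events. The following is an in-memory sketch of that idea, not Marquez's actual implementation: it takes a list of OpenLineage-style event dicts and returns every dataset downstream of a starting dataset. The function name and the sample dataset names are hypothetical.

```python
from collections import defaultdict, deque


def downstream_datasets(events, start):
    """Return the set of (namespace, name) datasets downstream of `start`.

    `events` is a list of dicts with optional "inputs" and "outputs" lists,
    each entry holding "namespace" and "name" keys.
    """
    # Map each input dataset to the outputs of every job run that read it.
    edges = defaultdict(set)
    for ev in events:
        ins = [(d["namespace"], d["name"]) for d in ev.get("inputs", [])]
        outs = [(d["namespace"], d["name"]) for d in ev.get("outputs", [])]
        for src in ins:
            edges[src].update(outs)

    # Breadth-first traversal from the starting dataset.
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

Given events where `raw` feeds `clean` and `clean` feeds both `report` and `features`, calling the function on `raw` yields all three downstream datasets. This is the kind of answer an engineer needs before changing or dropping a source table.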
Resources
Official Marquez project site with documentation on deploying the OpenLineage reference implementation and lineage UI.
Source repository with architecture docs, API reference, and integration guides for Spark, Airflow, and dbt.
Comparison of open-source lineage tools covering Marquez's role as the OpenLineage reference implementation.