Flink CDC
Apache Flink connectors for reading database change logs (MySQL binlog, PostgreSQL WAL) and streaming them directly into lakehouse formats on S3 without an intermediate message broker.
Summary
Flink CDC removes Kafka from the CDC pipeline. Instead of Database → Debezium → Kafka → Flink → S3, the architecture becomes Database → Flink CDC → S3. This reduces latency, operational complexity, and infrastructure costs for database-to-lakehouse replication.
- Eliminating Kafka also eliminates its replay buffer. If the Flink job fails, replay must come from the database logs, which may have limited retention.
- Memory usage can be significant under high-throughput workloads. Capacity planning for Flink CDC is critical.
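The replay caveat above can be turned into a simple operational check. The sketch below assumes the source database expires its change log after a fixed time window (for example, MySQL's `binlog_expire_logs_seconds`) and that the time of the job's last successful checkpoint is known. The function name, parameters, and the one-hour safety margin are illustrative, not part of Flink CDC.

```python
from datetime import datetime, timedelta, timezone


def can_resume_from_log(last_checkpoint_at, log_retention, now=None,
                        safety_margin=timedelta(hours=1)):
    """Estimate whether a failed job can still replay from the database log.

    Assumption: the log is retained for a fixed time window, so the position
    recorded in the last checkpoint is still readable only if the checkpoint
    is younger than that window (minus a safety margin).
    """
    now = now or datetime.now(timezone.utc)
    checkpoint_age = now - last_checkpoint_at
    return checkpoint_age + safety_margin <= log_retention


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
    last_checkpoint = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    # Checkpoint is 24 hours old; compare against two retention settings.
    print(can_resume_from_log(last_checkpoint, timedelta(days=3), now=now))
    print(can_resume_from_log(last_checkpoint, timedelta(hours=24), now=now))
```

If the check fails, the job generally cannot resume incrementally and a fresh snapshot of the source tables is needed. With a Kafka-based pipeline, the broker's own retention would have provided a second buffer.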
depends_on: Apache Flink (runs as Flink connectors)
enables: Apache Paimon, Apache Iceberg, Apache Hudi (writes CDC data directly to lakehouse formats)
scoped_to: Table Formats (ingestion framework for S3-based table formats)
Definition
A set of Apache Flink connectors that read database change logs (MySQL binlog, PostgreSQL WAL, MongoDB oplog) and stream them directly into lakehouse table formats on S3, without requiring an intermediate message broker.
Traditional CDC pipelines require Kafka or a similar message queue between the source database and the lake. Flink CDC eliminates this intermediate layer by reading change logs directly and writing to Iceberg, Paimon, or Hudi on S3, reducing operational complexity and latency.
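To make the broker-free path concrete, here is a minimal sketch that assembles Flink SQL statements for a MySQL-to-Paimon pipeline. The connector options (`mysql-cdc`, `hostname`, `database-name`, `table-name`) and the Paimon catalog options (`type`, `warehouse`) follow the documented connector settings as far as we know. The table schema, catalog name `lake`, credentials, and the helper function are hypothetical placeholders. The statements are only built as strings here. Running them requires a Flink cluster with the CDC and Paimon jars installed and checkpointing enabled.

```python
def build_cdc_to_paimon_sql(host, database, table, warehouse):
    """Return Flink SQL statements for a MySQL -> Paimon pipeline.

    Illustrative only: the two-column schema, the catalog name 'lake',
    and the credentials are placeholders, not values from any real setup.
    """
    create_catalog = (
        "CREATE CATALOG lake WITH (\n"
        "  'type' = 'paimon',\n"
        f"  'warehouse' = '{warehouse}'\n"
        ")"
    )
    # Source: reads the initial snapshot, then the MySQL binlog.
    create_source = (
        "CREATE TEMPORARY TABLE orders_src (\n"
        "  id BIGINT,\n"
        "  status STRING,\n"
        "  PRIMARY KEY (id) NOT ENFORCED\n"
        ") WITH (\n"
        "  'connector' = 'mysql-cdc',\n"
        f"  'hostname' = '{host}',\n"
        "  'port' = '3306',\n"
        "  'username' = 'flink_user',\n"   # placeholder credentials
        "  'password' = 'change-me',\n"
        f"  'database-name' = '{database}',\n"
        f"  'table-name' = '{table}'\n"
        ")"
    )
    # Sink: a primary-key table in the Paimon catalog on S3.
    create_sink = (
        "CREATE TABLE IF NOT EXISTS lake.`default`.orders (\n"
        "  id BIGINT,\n"
        "  status STRING,\n"
        "  PRIMARY KEY (id) NOT ENFORCED\n"
        ")"
    )
    # The continuous INSERT is the whole pipeline: no broker in between.
    insert = "INSERT INTO lake.`default`.orders SELECT id, status FROM orders_src"
    return [create_catalog, create_source, create_sink, insert]


if __name__ == "__main__":
    for stmt in build_cdc_to_paimon_sql(
        "mysql.internal", "shop", "orders", "s3://example-bucket/warehouse"
    ):
        print(stmt + ";\n")
```

The statements could be pasted into the Flink SQL client or passed one by one to a table environment's `execute_sql`. The point of the sketch is that the source table and the lakehouse sink are joined by a single `INSERT`, with Flink checkpoints acting as the only recovery mechanism.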
Typical use cases: database-to-lakehouse replication without Kafka, real-time data mirroring from operational databases to S3, and streaming CDC ingestion into Iceberg or Paimon tables.
Resources
Official Flink CDC documentation covering supported databases, connector configuration, and pipeline setup.
Source repository with connector implementations, version compatibility matrix, and migration guides.
Engineering guide to CDC strategies for Iceberg covering Flink CDC as a Kafka-free alternative.