Apache Doris
A real-time analytical database with native lakehouse capabilities, querying Iceberg, Hudi, and Paimon tables on S3 directly. In late 2025, Doris added native Paimon Deletion Vector support and Hive/FileSystem catalogs.
Summary
Doris bridges the gap between real-time serving and lakehouse analytics. Rather than requiring a separate engine for interactive dashboards vs. batch analytics, Doris provides sub-second queries directly on S3-stored lakehouse tables with native support for all major table formats.
- Native lakehouse support does not mean Doris replaces the table format engine. Doris reads lakehouse tables but does not manage compaction, snapshot expiry, or table maintenance — those remain the responsibility of Iceberg/Hudi/Paimon.
- Sub-second performance depends on query patterns and data layout. Complex joins over large unpartitioned tables on S3 may not achieve interactive latency.
- reads_from: Apache Iceberg, Apache Hudi, Apache Paimon (native lakehouse table reading)
- implements: S3 API (direct S3 data access)
- solves: Cold Scan Latency (interactive performance on S3 data)
Definition
A real-time analytical database with native lakehouse capabilities, supporting direct queries over Apache Iceberg, Hudi, and Paimon tables on S3. In late 2025, Doris added native support for Paimon Deletion Vectors and Hive/FileSystem catalogs.
Real-time analytics on S3-based lakehouses traditionally requires multiple engines — one for ingestion, another for serving. Doris combines real-time ingestion with sub-second query performance, querying S3-stored lakehouse tables directly without requiring data movement.
Real-time analytics over S3 lakehouse tables, sub-second dashboards on Iceberg/Hudi/Paimon data, unified real-time and batch query serving.
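As a rough illustration of what "querying S3-stored lakehouse tables directly" looks like in practice, the sketch below renders the two SQL statements involved: registering an external Iceberg catalog and then querying a table in it through a three-part `catalog.database.table` name. The catalog name `lake`, the `sales.orders` table and its columns, the REST endpoint, and the credential placeholders are all hypothetical. The property keys follow the pattern used in the Doris catalog documentation but should be checked against the version you run. The code only builds strings; sending them to Doris (for example over its MySQL-compatible protocol) is left out.

```python
def quote(value: str) -> str:
    """Double-quote a property key or value, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def render_create_catalog(name: str, properties: dict) -> str:
    """Render a CREATE CATALOG statement from a property map.

    `name` is assumed to be a trusted identifier; it is not validated here.
    """
    lines = [f"    {quote(k)} = {quote(v)}" for k, v in properties.items()]
    return f"CREATE CATALOG {name} PROPERTIES (\n" + ",\n".join(lines) + "\n);"


def render_dashboard_query(catalog: str, database: str, table: str) -> str:
    """Render a dashboard-style aggregate over an external lakehouse table.

    The column names (order_date, amount) are hypothetical.
    """
    return (
        "SELECT order_date, SUM(amount) AS revenue\n"
        f"FROM {catalog}.{database}.{table}\n"
        "WHERE order_date >= '2026-01-01'\n"
        "GROUP BY order_date\n"
        "ORDER BY order_date;"
    )


# Hypothetical Iceberg REST catalog backed by S3. Every value is a placeholder,
# and the property keys are assumptions to verify against the Doris docs.
ICEBERG_ON_S3 = {
    "type": "iceberg",
    "iceberg.catalog.type": "rest",
    "uri": "http://iceberg-rest.example.internal:8181",
    "s3.endpoint": "https://s3.us-east-1.amazonaws.com",
    "s3.region": "us-east-1",
    "s3.access_key": "<access-key>",
    "s3.secret_key": "<secret-key>",
}

if __name__ == "__main__":
    print(render_create_catalog("lake", ICEBERG_ON_S3))
    print()
    print(render_dashboard_query("lake", "sales", "orders"))
```

The point of the sketch is that no data is copied into Doris: the catalog statement only records where the table metadata and files live, and the query reads them in place.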
Recent developments
- Latest releases: 4.1.2 (June 2026) on the "Latest" feature branch; 4.0.6 (June 2026) on the "Stable" branch. Doris ships two parallel lines: Stable (continuous bug fixes, recommended for production) and Latest (newest features, intended for evaluation). Production deployments should track the 4.0 Stable line; the 4.1 features below are on the Latest line. Source: the Apache Doris version rules and all-releases pages.
- Apache Doris 4.1.0 (April 21, 2026): unified storage and retrieval for AI + search. Doris 4.1 extends the AI/agent foundation from 4.0 with two new vector index types, IVF and IVF_ON_DISK, that scale vector retrieval to billion- and trillion-vector datasets. The search() function now supports BM25 scoring with Elasticsearch-compatible syntax, so full-text search and analytics share one SQL surface. Native support for single JSON documents up to 100 MB targets agent memory and long-context AI workloads. On the OLAP side: +22.6% on TPC-H, +19.1% on TPC-DS, +14.3% on SSB vs 4.0, and it ranks first on ClickBench cold-query.
- Lakehouse parity: full Iceberg V2/V3 read+write, Paimon DDL via SQL. Per the VeloDB deep-dive on 4.1, Doris 4.1 supports full Apache Iceberg V2 and V3 read and write (including the V3 deletion-vector path), Apache Paimon DDL management directly via SQL, and a +20% Parquet Page Cache uplift on cold reads. Combined with vector + full-text + structured filtering in one engine, this positions Doris as a single-system replacement for "OLTP + analytics + AI" stacks that previously required two or three engines.
- Doris 34x faster than ClickHouse on real-time updates (vendor benchmark). A vendor-published benchmark measured Doris up to 34× faster than ClickHouse on real-time update workloads — the workload pattern where ClickHouse historically struggles (heavy concurrent UPSERT). The headline number comes from a vendor source so weight accordingly, but it reinforces a structural difference: Doris was designed primary-key-first, ClickHouse mutation-second.
- Up to 70% better price-performance on AWS Graviton4 (ARM). An independent benchmark on AWS Graviton4 across five OLAP suites (ClickBench, SSB 100G, SSB-Flat, TPC-H, TPC-DS) shows Doris on ARM-based Graviton consistently delivering 54–70% higher price-performance than equivalent x86 instances, attributed to vectorized CPU instruction usage and multithreading on Graviton4's Arm Neoverse V2 cores. The architectural takeaway: Doris is one of the OLAP engines that meaningfully compounds the cloud-ARM cost shift rather than merely running on it.
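To make the price-performance figures in the Graviton4 item concrete, here is a small worked calculation. It assumes the common definition of price-performance as work done per dollar, so the gain is the ratio of cost per benchmark run on the baseline to cost per run on the new instance, minus one. The source does not state the benchmark's exact methodology, and the runtimes and hourly prices below are made-up numbers for illustration only.

```python
def cost_per_run(runtime_seconds: float, price_per_hour: float) -> float:
    """Dollar cost of one benchmark run on an instance billed per hour."""
    return runtime_seconds / 3600.0 * price_per_hour


def price_performance_gain(
    base_runtime_s: float,
    base_price_per_hour: float,
    new_runtime_s: float,
    new_price_per_hour: float,
) -> float:
    """Relative price-performance gain of the new instance over the baseline.

    A result of 0.5 means the new instance does 50% more work per dollar.
    """
    base_cost = cost_per_run(base_runtime_s, base_price_per_hour)
    new_cost = cost_per_run(new_runtime_s, new_price_per_hour)
    return base_cost / new_cost - 1.0


if __name__ == "__main__":
    # Hypothetical numbers: the ARM instance is 20% faster and 20% cheaper.
    gain = price_performance_gain(
        base_runtime_s=100.0, base_price_per_hour=1.00,  # x86 baseline
        new_runtime_s=80.0, new_price_per_hour=0.80,     # ARM instance
    )
    print(f"price-performance gain: {gain:.1%}")
```

With these illustrative inputs the gain is about 56%, which shows how a moderate speedup and a moderate price cut multiply: neither factor alone would reach that figure.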
Resources
Official Apache Doris project site with documentation on real-time analytics, lakehouse integration, and deployment.
Doris lakehouse documentation covering native Iceberg, Hudi, and Paimon table support with Deletion Vector compatibility.
Source repository with lakehouse connector implementations, catalog integration, and release notes for 2025 features.