ClickHouse
A column-oriented DBMS designed for real-time analytical queries, with native support for reading from and writing to S3.
Summary
ClickHouse occupies the performance tier above pure lakehouse queries. It can use S3 as a storage backend (S3-backed MergeTree) while maintaining its own columnar indexes for sub-second query performance — bridging the gap between S3 data lakes and dedicated analytics databases.
- ClickHouse with S3 storage is not the same as querying S3 directly. ClickHouse maintains local indexes and metadata for performance; it uses S3 for durability and cost.
- The S3 table function (for ad-hoc S3 reads) and the S3-backed MergeTree engine (for persistent tables) are different features with different performance characteristics.
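The difference between the two features is easiest to see in the SQL each one involves. The sketch below only builds the statement text and does not connect to a server. The bucket URL, table name, column list, and the `s3_main` storage policy are hypothetical, and the policy is assumed to be defined already in the server's storage configuration.

```python
# Illustrative only: values are interpolated without escaping, so do not
# pass untrusted input to these helpers.

def s3_adhoc_query(url: str, fmt: str = "Parquet") -> str:
    """Ad-hoc read: the s3() table function scans the objects at query time."""
    return f"SELECT count() FROM s3('{url}', '{fmt}')"


def s3_backed_table_ddl(table: str, policy: str) -> str:
    """Persistent table: MergeTree data lives on the disks of a storage policy.

    The policy name is an assumption; it must exist in the server config
    and point at an S3 disk for the table to be S3-backed.
    """
    return (
        f"CREATE TABLE {table}\n"
        "(\n"
        "    event_time DateTime,\n"
        "    user_id UInt64,\n"
        "    payload String\n"
        ")\n"
        "ENGINE = MergeTree\n"
        "ORDER BY (user_id, event_time)\n"
        f"SETTINGS storage_policy = '{policy}'"
    )


if __name__ == "__main__":
    # Hypothetical bucket and names, for illustration.
    print(s3_adhoc_query("https://example-bucket.s3.amazonaws.com/events/*.parquet"))
    print(s3_backed_table_ddl("events", "s3_main"))
```

The first statement leaves the data where it is and reads it on every query. The second creates a table whose sorted parts and primary index are managed by ClickHouse, with S3 used as the underlying disk.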
depends_on: Apache Parquet (reads/writes Parquet for S3 interop)
implements: Separation of Storage and Compute (S3-backed storage with independent compute)
scoped_to: S3, Lakehouse
Definition
A column-oriented database management system designed for real-time analytical queries, with native support for reading from and writing to S3. The 25.x series (early 2026) added **bidirectional Iceberg read/write**, **Delta Lake INSERT** support, and native **Apache Paimon** compatibility, moving ClickHouse from a hot-tier accelerator to a first-class lakehouse engine. ClickHouse Inc. acquired **Langfuse** (LLM observability) in March 2026, adding an LLM trace store alongside the analytics engine.
Some analytical workloads require sub-second query performance on recent data, which pure S3-backed query engines cannot consistently deliver. ClickHouse uses S3 as a storage backend while maintaining its own columnar indexes for speed; with Iceberg/Delta/Paimon integration it can also serve as the query layer over an open-format lakehouse without rewriting source-of-truth data.
Real-time analytics dashboards backed by S3 storage, log analytics with S3 archival, hybrid hot/cold query patterns, LLM observability stores (Langfuse), bidirectional read/write against open table formats.
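A hybrid hot/cold pattern is commonly expressed with a table-level TTL rule that moves older parts to a different volume. The following is a minimal sketch that generates such a DDL statement. The table name, columns, the `hot_cold` policy, and the `cold` volume are all hypothetical, and the policy is assumed to be defined in the server configuration with a local volume and an S3-backed volume.

```python
def tiered_table_ddl(table: str, policy: str, cold_volume: str, hot_days: int) -> str:
    """Build a MergeTree DDL whose parts move to `cold_volume` after `hot_days` days.

    Illustrative only: names are interpolated without escaping.
    """
    if hot_days <= 0:
        raise ValueError("hot_days must be a positive number of days")
    return (
        f"CREATE TABLE {table}\n"
        "(\n"
        "    event_date Date,\n"
        "    message String\n"
        ")\n"
        "ENGINE = MergeTree\n"
        "ORDER BY event_date\n"
        # Parts whose rows are all older than the interval become eligible to move.
        f"TTL event_date + INTERVAL {hot_days} DAY TO VOLUME '{cold_volume}'\n"
        f"SETTINGS storage_policy = '{policy}'"
    )


if __name__ == "__main__":
    print(tiered_table_ddl("logs", "hot_cold", "cold", 30))
```

Recent data stays on the fast volume for dashboard queries, while older data remains queryable from the S3-backed volume at lower storage cost.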
Recent developments
- Vector Search GA + 9,000× faster JSON than PostgreSQL JSONB. Per ClickHouse's 2025 roundup, the v25.8 release brought vector search with binary quantization to general availability — ClickHouse joins the analytic-database-with-vector-search shape that Snowflake and Databricks reached earlier, but at OSS-engine speed. The real-time analytics database guide for 2026 reports native JSON throughput 9,000× faster than PostgreSQL JSONB on JSONBench — a real workload pattern for telemetry, event-store, and LLM-trace ingestion (the Langfuse acquisition is downstream of this number).
- Automatic Global Join Reordering: TPC-H SF100 wins by ~1,450×. Per the same engineering guide, v25.9's Automatic Global Join Reordering posted a ~1,450× speedup on TPC-H SF100 vs the prior planner. v25.10 added runtime bloom filters for an additional 2.1× speedup on selective joins. The cumulative effect: ClickHouse closes the "complex JOIN performance" gap that historically pushed teams toward Snowflake or Databricks for multi-fact-table analytics.
- v26.4 lands JSON skip indexes + NATURAL JOIN + parameterized Web UI queries (April 30, 2026). Per the Changelog 2026, v26.4 ships MergeTree skip index support for JSON columns (the missing piece for using native JSON as a queryable column type at scale), `NATURAL JOIN` syntax for terser SQL, a `commit_order` projection index, and parameterized queries in the Web UI. chDB, the embedded ClickHouse runtime, picked up the v25.8 kernel for a reported 61× performance improvement, keeping the in-process analytical-database story competitive with DuckDB.
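The binary quantization mentioned in the vector search item above can be illustrated generically. This is a sketch of the general technique, not ClickHouse's internal implementation: each vector component is reduced to one bit (its sign), and candidates are compared by Hamming distance on the packed bits. All function names here are hypothetical.

```python
from typing import Sequence


def binarize(vec: Sequence[float]) -> int:
    """Pack one bit per dimension: 1 if the component is positive, else 0."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two packed codes differ."""
    return bin(a ^ b).count("1")


def nearest(query: Sequence[float], corpus: Sequence[Sequence[float]]) -> int:
    """Index of the corpus vector whose binary code is closest to the query's.

    Assumes every vector has the same number of dimensions.
    """
    q = binarize(query)
    codes = [binarize(v) for v in corpus]
    return min(range(len(codes)), key=lambda i: hamming(q, codes[i]))


if __name__ == "__main__":
    corpus = [[-1.0, 1.0, -1.0], [1.0, -1.0, 1.0], [1.0, 1.0, 1.0]]
    print(nearest([0.9, -0.2, 0.4], corpus))
```

The trade-off is accuracy for memory and speed: a 32-bit float per dimension shrinks to a single bit, so systems that use this approach typically re-rank the top candidates with full-precision distances.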
Resources
Official ClickHouse documentation covering the column-oriented OLAP database engine, SQL dialect, and all table engines.
The primary ClickHouse repository — one of the most active C++ database projects, with the full analytical engine source.
ClickHouse's dedicated S3 integration page documents the S3 table function, S3Queue engine, S3-backed MergeTree, and S3 disk configuration.
Detailed ClickHouse changelog tracks every release including S3 engine improvements and storage backend changes.