Technology

StarTree Cloud

Summary

What it is

A managed Apache Pinot platform that serves sub-second, high-concurrency analytics directly on Apache Iceberg and Parquet tables in object storage, with no ETL into a separate store.

Where it fits

It sits at the serving layer of a lakehouse, competing with lakehouse query engines such as Trino and ClickHouse on latency and cost per query rather than on ad-hoc SQL breadth. For S3-native data infrastructure, it turns Iceberg-on-S3 into a directly servable substrate for user-facing analytics.

Misconceptions / Traps

Native Iceberg querying was announced in July 2025; the eye-catching 9-39x numbers come from a separate May 2026 benchmark, not the launch post.
Headline QPS (up to 498) depends on index pinning; cold first-touch queries against object storage are slower (491 ms to 2.4 s cold-cache in the same benchmark).
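
To put the headline numbers in context, the figures reported under Latest signals (498 QPS at 289 ms average latency, $0.0012 per query, and cost ratios of roughly 7.4x and 14.9x) can be turned into a few implied quantities. The sketch below is only arithmetic on those published figures. The competitor costs it prints are derived from the "roughly Nx cheaper" ratios and are not reported values, and the concurrency estimate uses Little's law, which assumes a steady-state workload.

```python
# Figures quoted from the May 2026 benchmark summary in this entry.
startree_cost_per_query = 0.0012   # USD, sequential run
trino_cost_ratio = 7.4             # "roughly 7.4x cheaper than Trino"
clickhouse_cost_ratio = 14.9       # "14.9x cheaper than ClickHouse"
peak_qps = 498                     # with index pinning
avg_latency_s = 0.289              # 289 ms average latency

# Implied (not reported) competitor cost per query, from the stated ratios.
implied_trino_cost = startree_cost_per_query * trino_cost_ratio
implied_clickhouse_cost = startree_cost_per_query * clickhouse_cost_ratio

# Little's law: average in-flight queries = arrival rate * average latency.
# Assumption: the 498 QPS run was at steady state.
in_flight_queries = peak_qps * avg_latency_s

print(f"Implied Trino cost/query:      ${implied_trino_cost:.5f}")
print(f"Implied ClickHouse cost/query: ${implied_clickhouse_cost:.5f}")
print(f"Approx. concurrent queries at peak: {in_flight_queries:.0f}")
```

Read this way, the 498 QPS figure corresponds to roughly 144 queries in flight at once, which is a property of the pinned-index configuration and should not be expected from cold first-touch queries.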

Key Connections

reads_from Apache Iceberg — queries Iceberg/Parquet directly without conversion to Pinot segments.
competes_with Trino — same Iceberg-on-S3 data, but optimized for low-latency high-QPS serving.
enables Lakehouse — adds an interactive serving tier on top of lakehouse storage.

Definition

What it is

StarTree Cloud is a fully managed real-time analytics platform built on Apache Pinot, the open-source OLAP database for low-latency, high-concurrency queries. It extends Pinot with enterprise features and adds native querying of Apache Iceberg and Parquet tables directly on object storage. The goal is interactive, external-facing analytics served straight from a lakehouse.

Why it exists

Iceberg tables on S3 are normally queried by general-purpose engines (Trino, ClickHouse) with seconds-to-minutes latency. StarTree lets you serve sub-second, high-QPS queries directly off Iceberg/Parquet on object storage without ETL into a separate serving store, eliminating the copy from the lakehouse into a serving layer. That makes object storage itself the serving substrate for user-facing analytics.

Primary use cases

User-facing/external analytics, real-time dashboards on lakehouse data, high-QPS aggregations over Iceberg, anomaly and metrics serving, lakehouse query acceleration without data duplication.

Recent developments

Latest signals

StarTree added native Apache Iceberg support, letting Pinot query Iceberg and Parquet tables directly with no data transformation or duplication. Announced July 23, 2025, it removes the prior requirement to extract and convert lakehouse data into Pinot's native segment format. Per StarTree Adds Native Iceberg Support.
A May 2026 benchmark on a 12.2-billion-row (~770 GB) Iceberg dataset showed StarTree 9-39x faster than ClickHouse OSS and 4-21x faster than Trino, with no local storage. Cold-cache latencies ranged from 491 ms (filtered count) to 2.4 s (map column) versus multi-second times for the others. Per StarTree Benchmark: Iceberg Query Performance vs. Trino vs. ClickHouse.
With index pinning, StarTree sustained up to 498 QPS at 289 ms average latency on the same Iceberg data. Cost per query in the sequential run was $0.0012, roughly 7.4x cheaper than Trino and 14.9x cheaper than ClickHouse, at 7.1% server CPU versus 31.3% (Trino) and 44.8% (ClickHouse). Per StarTree Benchmark: Iceberg Query Performance vs. Trino vs. ClickHouse.
It achieves direct Iceberg serving via a Parquet forward-index reader plus aggressive pruning, transferring less data from object storage per query. Per Low Latency Serving on Iceberg with Apache Pinot, in StarTree Cloud.
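
The pruning idea in the last item can be illustrated generically. Iceberg manifests and Parquet footers record per-column lower and upper bounds, so an engine can skip any file whose value range cannot match a filter and never fetch it from object storage. The sketch below is a toy model of that min/max pruning step only. It is not StarTree's implementation, and the `FileStats` class, the `prune` function, the file names, and the sizes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    """Hypothetical per-file metadata, as a catalog might expose it."""
    path: str
    min_ts: int       # lower bound of the filter column in this file
    max_ts: int       # upper bound of the filter column in this file
    size_bytes: int


def prune(files, lo, hi):
    """Keep only files whose [min_ts, max_ts] range overlaps [lo, hi]."""
    return [f for f in files if f.max_ts >= lo and f.min_ts <= hi]


# Made-up files of equal size, each covering a disjoint timestamp range.
files = [
    FileStats("part-0.parquet", 0, 99, 100_000_000),
    FileStats("part-1.parquet", 100, 199, 100_000_000),
    FileStats("part-2.parquet", 200, 299, 100_000_000),
    FileStats("part-3.parquet", 300, 399, 100_000_000),
]

# Query filter: ts BETWEEN 150 AND 220
selected = prune(files, lo=150, hi=220)
bytes_scanned = sum(f.size_bytes for f in selected)
bytes_total = sum(f.size_bytes for f in files)

print([f.path for f in selected])
print(f"fraction of bytes read: {bytes_scanned / bytes_total:.2f}")
```

In this toy case only two of the four files overlap the filter, so half the bytes are never transferred. Real engines apply the same test at several levels (partition, file, row group, page) and combine it with indexes, which is where larger savings would come from.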