Trino
A distributed SQL query engine for federated analytics across heterogeneous data sources, with deep support for S3-backed data lakes and lakehouses.
Summary
Trino is a multi-source query layer for S3 lakehouses. It queries Iceberg, Delta, Hudi, and raw Parquet on S3 through connectors, and can join S3 data with operational databases in a single query.
- Trino is a query engine, not a storage engine. It reads from S3 but does not manage data. Writes go through table format commit protocols.
- Trino requires a coordinator and workers — operational overhead is higher than DuckDB. Use DuckDB for single-user exploration; Trino for multi-user production queries.
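The coordinator/worker split mentioned above shows up directly in each node's `config.properties`. The following is a minimal sketch that renders such a file for either role. The helper name `render_node_config` and the host name `coordinator.example.internal` are hypothetical, and the property set is deliberately incomplete; check the deployment documentation for the Trino version in use before relying on it.

```python
def render_node_config(is_coordinator: bool, discovery_uri: str, port: int = 8080) -> str:
    """Render a minimal Trino config.properties body for one node (illustrative only)."""
    lines = [f"coordinator={'true' if is_coordinator else 'false'}"]
    if is_coordinator:
        # Keep query processing off the coordinator so it only plans and schedules.
        lines.append("node-scheduler.include-coordinator=false")
    lines.append(f"http-server.http.port={port}")
    # Every node, coordinator and worker alike, points at the coordinator for discovery.
    lines.append(f"discovery.uri={discovery_uri}")
    return "\n".join(lines) + "\n"


# Hypothetical coordinator address, used only for illustration.
COORDINATOR_URI = "http://coordinator.example.internal:8080"

coordinator_config = render_node_config(True, COORDINATOR_URI)
worker_config = render_node_config(False, COORDINATOR_URI)

print(coordinator_config)
print(worker_config)
```

A single-process engine such as DuckDB needs none of this, which is the operational difference the bullet above describes.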
- depends_on: Apache Parquet (reads Parquet files from S3)
- used_by: Lakehouse Architecture (a primary query engine for lakehouses)
- constrained_by: Small Files Problem, Object Listing Performance (performance affected by S3 access patterns)
- Natural Language Querying: augments Trino (LLMs generate SQL for Trino)
- scoped_to: S3, Lakehouse
Definition
A distributed SQL query engine designed for federated analytics across heterogeneous data sources, with deep support for querying S3-backed data lakes and lakehouses.
Organizations store data across many systems. Trino provides a single SQL interface to query data wherever it lives — including directly on S3 via Parquet, ORC, Iceberg, Delta, and Hudi connectors — without moving data.
Typical uses: federated SQL across S3-backed sources, interactive lakehouse queries, and cross-source joins between S3 data and operational databases.
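A cross-source join of the kind described above can be sketched as follows. The catalog names `lake` (an Iceberg catalog on S3) and `pg` (a PostgreSQL catalog), the table and column names, and the helpers `build_federated_query` and `run_query` are all hypothetical. `run_query` assumes the `trino` Python client package is installed and a cluster is reachable, so it is defined but not called here.

```python
def build_federated_query(lake_table: str, db_table: str, min_total: float) -> str:
    """Build a SQL statement joining an S3-backed table with an operational table."""
    return (
        "SELECT c.customer_id, c.email, sum(o.total) AS lifetime_total\n"
        f"FROM {lake_table} AS o\n"
        f"JOIN {db_table} AS c ON o.customer_id = c.customer_id\n"
        "GROUP BY c.customer_id, c.email\n"
        f"HAVING sum(o.total) >= {min_total}"
    )


def run_query(sql: str, host: str = "localhost", port: int = 8080, user: str = "analyst"):
    """Execute the statement on a Trino cluster (requires the `trino` client package)."""
    from trino.dbapi import connect  # imported lazily so the sketch loads without it

    conn = connect(host=host, port=port, user=user)
    cur = conn.cursor()
    cur.execute(sql)
    return cur.fetchall()


# Fully qualified names follow Trino's catalog.schema.table convention.
query = build_federated_query("lake.sales.orders", "pg.public.customers", 1000)
print(query)
```

Both tables are addressed in one statement because each catalog is just a name prefix to Trino; no data is copied between systems before the join.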
Recent developments
- Trino 478 and 479 keep up the release cadence. Per the Trino Community Broadcast episode list, releases 478 and 479 continue the project's high-cadence release pattern, with episodes covering virtual view hierarchies (with Rob Dickinson), AI agents for query development, Trino Query UI updates, and the Trino Gateway progression from 16 to 18. The steady broadcast schedule is itself a competitive signal: it reflects a level of community engagement that closed-source data-warehouse vendors struggle to match.
- Trino Gateway 18 adds in-memory caching; Gateway 17 moved to Java 25. Per the Trino Gateway release notes, Gateway 18 adds in-memory caching of backend metadata, query-history deactivation, and UI timezone selection. Its predecessor, Gateway 17, requires Java 25 and ships on the UBI10 micro base image with JMX metrics enabled. For organizations running Trino as a federated query layer over S3-backed lakehouses, the Gateway tier is now first-class infrastructure with its own release cadence rather than an afterthought.
Resources
Official Trino documentation covering the distributed SQL query engine architecture, connectors, and query execution.
Main Trino source repository (formerly PrestoSQL) including all connectors, the query optimizer, and the execution engine.
Trino's object storage documentation details how to configure S3 as the backing store for Hive, Iceberg, Delta Lake, and Hudi connectors.
The Iceberg connector docs are a key S3 integration point, showing how Trino queries Iceberg table format data stored on S3.
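As a rough illustration of what the object storage and Iceberg connector documentation covers, the sketch below renders a catalog properties file for an Iceberg catalog on S3. The helper `render_iceberg_catalog`, the metastore address, and the region are hypothetical. The property names follow recent Trino releases that use the native S3 file system and should be verified against the documentation for the version in use. Credentials are intentionally left out.

```python
def render_iceberg_catalog(metastore_uri: str, region: str) -> str:
    """Render a minimal catalog properties body for an Iceberg catalog on S3."""
    props = {
        "connector.name": "iceberg",
        # Assumes a Hive metastore tracks the Iceberg tables; other catalog types exist.
        "iceberg.catalog.type": "hive_metastore",
        "hive.metastore.uri": metastore_uri,
        # Enables Trino's native S3 file system support.
        "fs.native-s3.enabled": "true",
        "s3.region": region,
    }
    return "".join(f"{key}={value}\n" for key, value in props.items())


# Hypothetical metastore address and region, for illustration only.
catalog_properties = render_iceberg_catalog(
    "thrift://metastore.example.internal:9083", "us-east-1"
)
print(catalog_properties)
```

Saved as `etc/catalog/lake.properties`, a file like this would expose the tables under a catalog named `lake`, since Trino takes the catalog name from the file name.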