Technology

Trino

A distributed SQL query engine for federated analytics across heterogeneous data sources, with deep support for S3-backed data lakes and lakehouses.


Summary

Where it fits

Trino is a multi-source query layer for S3 lakehouses. It queries Iceberg, Delta Lake, Hudi, and raw Parquet data on S3 through connectors, and it can join S3 data with operational databases in a single query.
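
As a rough illustration of such a cross-source join, the sketch below builds one Trino SQL statement that joins a lakehouse table on S3 with a table in an operational database. The catalog, schema, table, and column names (`lake`, `pg`, `sales.orders`, `public.customers`) are hypothetical. The optional `run_query` helper assumes the `trino` Python client is installed and a cluster is reachable at the placeholder host.

```python
"""Sketch: one Trino query joining S3 lakehouse data with an operational database.

All catalog, schema, table, and column names are hypothetical placeholders.
"""


def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified Trino table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def build_federated_join(lake_table: str, db_table: str) -> str:
    """Build a join between a lakehouse table and an operational table."""
    return (
        "SELECT c.customer_id, c.name, sum(o.amount) AS total_spent\n"
        f"FROM {lake_table} AS o\n"
        f"JOIN {db_table} AS c ON o.customer_id = c.customer_id\n"
        "GROUP BY c.customer_id, c.name\n"
        "ORDER BY total_spent DESC\n"
        "LIMIT 10"
    )


def run_query(sql: str, host: str = "localhost", port: int = 8080, user: str = "analyst"):
    """Run the query through the `trino` client (assumed installed; needs a live cluster)."""
    from trino.dbapi import connect  # lazy import: the SQL builder works without it

    cursor = connect(host=host, port=port, user=user).cursor()
    cursor.execute(sql)
    return cursor.fetchall()


ORDERS = qualified("lake", "sales", "orders")        # e.g. an Iceberg table on S3
CUSTOMERS = qualified("pg", "public", "customers")   # e.g. a PostgreSQL table
FEDERATED_SQL = build_federated_join(ORDERS, CUSTOMERS)

if __name__ == "__main__":
    print(FEDERATED_SQL)
```

Because every table is addressed as `catalog.schema.table`, the join needs no special syntax: each side simply names a different catalog.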

Misconceptions / Traps

Trino is a query engine, not a storage engine. It reads data on S3 but does not own or manage the storage layer. Writes to table formats such as Iceberg or Delta Lake go through that format's commit protocol.
Trino requires a coordinator and workers, so its operational overhead is higher than that of DuckDB, which runs in-process. Use DuckDB for single-user exploration and Trino for multi-user production queries.
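
To make the coordinator/worker split concrete, here is a minimal sketch that renders an `etc/config.properties` file for each role. The property names are the ones commonly shown in Trino's deployment documentation as far as I recall them. The discovery URI is a hypothetical placeholder, and real deployments set more properties (memory limits, for example), so treat this as an outline rather than a working cluster configuration.

```python
"""Sketch: minimal etc/config.properties for a Trino coordinator and a worker.

The discovery URI and port are hypothetical; real deployments need more settings.
"""


def render_properties(props: dict) -> str:
    """Render a dict as Java-style key=value properties, one per line."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


def node_config(is_coordinator: bool, discovery_uri: str, port: int = 8080) -> dict:
    """Return the role-specific properties for one Trino node."""
    config = {"coordinator": str(is_coordinator).lower()}
    if is_coordinator:
        # Keep query processing off the coordinator so it only plans and schedules.
        config["node-scheduler.include-coordinator"] = "false"
    config["http-server.http.port"] = str(port)
    # Workers use this URI to find and register with the coordinator.
    config["discovery.uri"] = discovery_uri
    return config


DISCOVERY_URI = "http://coordinator.example.net:8080"  # hypothetical host
COORDINATOR = render_properties(node_config(True, DISCOVERY_URI))
WORKER = render_properties(node_config(False, DISCOVERY_URI))

if __name__ == "__main__":
    print("# coordinator\n" + COORDINATOR)
    print("# worker\n" + WORKER)
```

Every node needs a file like this plus a running JVM, which is the overhead an in-process engine avoids.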

Key Connections

depends_on Apache Parquet — reads Parquet files from S3
used_by Lakehouse Architecture — a primary query engine for lakehouses
constrained_by Small Files Problem, Object Listing Performance — performance affected by S3 access patterns
Natural Language Querying augments Trino — LLMs generate SQL for Trino
scoped_to S3, Lakehouse

Definition

What it is

A distributed SQL query engine designed for federated analytics across heterogeneous data sources, with deep support for querying S3-backed data lakes and lakehouses. Trino is the project **formerly known as PrestoSQL**: the community fork that split from Facebook's original **Presto** in 2019, was renamed Trino in December 2020, and now carries the active open-source lineage. Analytics content labeled "Presto" in the modern lakehouse context generally refers to this project.

Why it exists

Organizations store data across many systems. Trino provides a single SQL interface to query data wherever it lives, without moving it. That includes data directly on S3: Parquet and ORC files through the Hive connector, and Iceberg, Delta Lake, and Hudi tables through their dedicated connectors.
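
In practice, each source is exposed to Trino as a catalog defined by a properties file under `etc/catalog/`. The sketch below generates two such files, one for Iceberg tables on S3 and one for a PostgreSQL database. The catalog names, hostnames, region, and environment variable are hypothetical. The property names reflect my understanding of the Trino connector documentation and should be checked against the version in use.

```python
"""Sketch: catalog files that expose an S3 lakehouse and a database to Trino.

Hostnames, region, and catalog names are hypothetical placeholders.
"""

CATALOGS = {
    "lake": {
        "connector.name": "iceberg",
        "iceberg.catalog.type": "hive_metastore",
        "hive.metastore.uri": "thrift://metastore.example.net:9083",
        "fs.native-s3.enabled": "true",
        "s3.region": "us-east-1",
    },
    "pg": {
        "connector.name": "postgresql",
        "connection-url": "jdbc:postgresql://db.example.net:5432/app",
        "connection-user": "trino",
        # Resolved from the environment at startup instead of stored in the file.
        "connection-password": "${ENV:PG_PASSWORD}",
    },
}


def catalog_files(catalogs: dict) -> dict:
    """Map each catalog name to its etc/catalog/<name>.properties content."""
    files = {}
    for name, props in catalogs.items():
        body = "\n".join(f"{key}={value}" for key, value in props.items())
        files[f"etc/catalog/{name}.properties"] = body + "\n"
    return files


if __name__ == "__main__":
    for path, body in catalog_files(CATALOGS).items():
        print(f"# {path}\n{body}")
```

The file name becomes the catalog name, so tables in these two sources would be addressed as `lake.<schema>.<table>` and `pg.<schema>.<table>` from the same SQL session.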

Primary use cases

Federated SQL across S3-backed sources, interactive lakehouse queries, cross-source joins between S3 data and operational databases.

Recent developments

Latest signals

Latest release: 481 (May 12, 2026), per trinodb/trino releases. Trino ships frequent releases, each identified by a single incrementing number.
Trino 478 and 479 sustain the community broadcast cadence. Per the Trino Community Broadcast episode list, the episodes covering Trino 478 and 479 continue the project's high-cadence release pattern, with topics including virtual view hierarchies (with Rob Dickinson), AI agents for query development, Trino Query UI updates, and the Trino Gateway progression from 16 to 18. The regular broadcast cadence is itself a competitive signal: Trino sustains a pace of public engagement with its users that closed-source data-warehouse vendors struggle to match.
Trino Gateway 18: in-memory caching, on top of the Java 25 baseline set by Gateway 17. Per the Trino Gateway release notes, Gateway 18 adds in-memory caching of backend metadata, the option to deactivate query history, and UI timezone selection. Its predecessor, Gateway 17, requires Java 25 and ships on the UBI10 micro base image with JMX metrics enabled. For organizations running Trino as a federated query layer over S3-backed lakehouses, the Gateway tier is now first-class infrastructure with its own release cadence rather than an afterthought.