Technology

Velox

A C++ vectorized execution engine developed by Meta that provides a unified, high-performance data processing backend usable by multiple front-end query engines including Presto, Spark, and custom data systems.

Summary

What it is

Where it fits

Velox sits beneath query planners as a shared execution layer. For S3-backed workloads, it accelerates scan, filter, aggregation, and join operations against Parquet files on object storage, and it is the execution engine behind Prestissimo, Presto's native C++ worker.
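The scan, filter, and aggregation steps above can be sketched as a batch-at-a-time pipeline. This is a minimal illustration in plain Python, not Velox's API: the function names (scan_batches, filter_batch, sum_selected, run_query) are hypothetical, and in-memory arrays stand in for Parquet column chunks read from object storage. It shows the general shape of vectorized execution, where each operator handles a whole column batch and passes a selection vector of surviving row indices downstream.

```python
from array import array


def scan_batches(columns, batch_size):
    """Yield fixed-size column batches, standing in for a columnar file scan."""
    num_rows = len(next(iter(columns.values())))
    for start in range(0, num_rows, batch_size):
        yield {name: col[start:start + batch_size] for name, col in columns.items()}


def filter_batch(batch, column, predicate):
    """Evaluate a predicate over one column and return a selection vector
    (the indices of rows in this batch that passed)."""
    return [i for i, value in enumerate(batch[column]) if predicate(value)]


def sum_selected(batch, column, selection):
    """Aggregate only the selected rows of one column."""
    col = batch[column]
    return sum(col[i] for i in selection)


def run_query(columns, batch_size=4):
    """Roughly: SELECT sum(amount) WHERE region = 1, one batch at a time."""
    total = 0
    for batch in scan_batches(columns, batch_size):
        selection = filter_batch(batch, "region", lambda r: r == 1)
        total += sum_selected(batch, "amount", selection)
    return total


# Hypothetical sample data: two columns of a ten-row table.
table = {
    "region": array("i", [1, 2, 1, 3, 1, 2, 2, 1, 3, 1]),
    "amount": array("q", [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]),
}
result = run_query(table)
```

The amount column is only touched for rows that survived the region filter, which is the property a real engine exploits to avoid decoding data it does not need.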

Misconceptions / Traps

Velox is not a standalone query engine. It is an execution library that must be embedded in a host system (Presto, Spark via Gluten, or a custom application).
Velox's performance gains come from vectorized execution and adaptive filtering, not from caching. Any data that is not already cached locally still has to be read from S3.
Integration with existing query engines (e.g., Spark via the Gluten project) is still maturing. Not all Spark operations have Velox equivalents, and those without one fall back to Spark's own JVM execution.
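The idea of adaptive filtering mentioned above can be sketched as a filter chain that reorders itself from observed pass rates. This is a simplified illustration under stated assumptions, not Velox's implementation: the class AdaptiveFilterChain, the column names, and the reorder-after-every-batch policy are all hypothetical, and a real engine would also weigh the cost of each filter.

```python
class AdaptiveFilterChain:
    """Applies column filters in sequence and moves the most selective
    filter (lowest observed pass rate) to the front for later batches."""

    def __init__(self, filters):
        # filters: list of (column_name, predicate) pairs
        self.filters = [
            {"column": c, "pred": p, "seen": 0, "passed": 0} for c, p in filters
        ]

    def order(self):
        """Current evaluation order, as a list of column names."""
        return [f["column"] for f in self.filters]

    def apply(self, batch):
        """Return the selection vector of row indices passing every filter."""
        num_rows = len(next(iter(batch.values())))
        selection = list(range(num_rows))
        for f in self.filters:
            col = batch[f["column"]]
            survivors = [i for i in selection if f["pred"](col[i])]
            f["seen"] += len(selection)
            f["passed"] += len(survivors)
            selection = survivors
        # Filters with the lowest observed pass rate run first next time,
        # so later filters see fewer rows.
        self.filters.sort(
            key=lambda f: f["passed"] / f["seen"] if f["seen"] else 1.0
        )
        return selection


# Hypothetical batch and filters.
batch = {
    "status": [200, 200, 500, 200, 200, 200, 404, 200],
    "latency_ms": [12, 900, 15, 8, 30, 1200, 20, 25],
}
filters = [
    ("status", lambda s: s == 200),        # passes most rows
    ("latency_ms", lambda ms: ms > 500),   # passes few rows
]
chain = AdaptiveFilterChain(filters)
selected = chain.apply(batch)
```

After the first batch the latency filter has the lower pass rate, so it moves ahead of the status filter. The result set is unchanged; only the amount of work per batch drops.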

Key Connections

scoped_to S3, Lakehouse — accelerates query execution over S3 data
depends_on Apache Arrow — uses Arrow-compatible columnar memory layout
enables Presto — Prestissimo, Presto's native C++ worker, uses Velox as its execution engine
enables Apache Spark — Gluten project integrates Velox with Spark
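As a rough picture of what an Arrow-compatible columnar layout means, the sketch below stores one nullable 64-bit integer column as a contiguous values buffer plus a validity bitmap, with one bit per row in least-significant-bit order as in the Arrow columnar format. The class Int64Column is hypothetical and written in plain Python for illustration; Velox's own vector classes are C++ and differ in detail.

```python
from array import array


class Int64Column:
    """Arrow-style nullable column: values buffer plus validity bitmap."""

    def __init__(self, values):
        self.length = len(values)
        # Contiguous 8-byte slots; null rows hold a placeholder 0.
        self.values = array("q", (0 if v is None else v for v in values))
        # One bit per row, least-significant bit first; 1 means "not null".
        self.validity = bytearray((self.length + 7) // 8)
        for i, v in enumerate(values):
            if v is not None:
                self.validity[i >> 3] |= 1 << (i & 7)

    def is_valid(self, i):
        return bool(self.validity[i >> 3] & (1 << (i & 7)))

    def null_count(self):
        return sum(1 for i in range(self.length) if not self.is_valid(i))

    def sum(self):
        """Sum of non-null values, skipping rows whose validity bit is 0."""
        return sum(v for i, v in enumerate(self.values) if self.is_valid(i))


# Hypothetical ten-row column with three nulls.
col = Int64Column([5, None, 7, None, 11, 13, 17, 19, None, 23])
total = col.sum()
```

Because nulls live in a separate bitmap, the values buffer stays fixed-width and contiguous, which is what lets engines that share the layout exchange columns without converting them row by row.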

Definition

What it is

A C++ vectorized database acceleration library created by Meta, designed to be embedded into query engines as a unified, high-performance execution layer for processing data stored in S3.

Why it exists

Multiple query engines (Spark, Presto, Flink) each implement their own execution runtimes with varying performance characteristics. Velox provides a shared, hardware-optimized execution core that any engine can embed, raising the performance floor for S3-based analytics.

Primary use cases

Accelerating Spark and Presto queries over S3 data, unified vectorized execution for lakehouse queries, hardware-optimized data processing.

Recent developments

Latest signals

Axiom: composable query engines built on Velox (announced April 23, 2026). Per the facebookincubator/velox repository, the Velox project announced Axiom, a framework for composing query engines on top of the Velox execution core. It is the next architectural step beyond "embed Velox into your existing engine": use Velox and Axiom as the foundation of a new engine without rebuilding the execution primitives from scratch. For lakehouse architects, this lowers the cost of developing a net-new analytical engine, which previously required hand-rolling a vectorized execution layer.
Velox shows up in independent research alongside DataFusion. The BARQ vectorized SPARQL paper (arXiv:2504.04584) names Velox, together with Apache DataFusion, as one of the open-source modular execution engines that new analytical-system research builds on. Meta's original Velox introduction blog (engineering.fb.com, 2022) remains the canonical reference for what Velox is and why it exists. The 2026 update is that the bet on embedding Velox into existing engines has paid off, and Velox is now infrastructure rather than research code.