Velox
A C++ vectorized execution engine developed by Meta that provides a unified, high-performance data processing backend usable by multiple front-end query engines including Presto, Spark, and custom data systems.
Summary
Velox sits beneath query planners as a shared execution layer. For S3-backed workloads, it accelerates scan, filter, aggregation, and join operations against Parquet files on object storage, and is the engine behind Presto's Velox-based execution (Prestissimo).
- Velox is not a standalone query engine. It is an execution library that must be embedded in a host system (Presto, Spark via Gluten, or a custom application).
- Velox's performance gains come from vectorized execution and adaptive filtering, not from caching. Data that is not already cached must still be fetched from S3.
- Integration with existing query engines (e.g., Spark via Gluten project) is still maturing. Not all Spark operations have Velox equivalents.
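The two techniques named above, vectorized execution and adaptive filtering, can be illustrated with a small sketch. This is a conceptual toy in plain Python, not Velox's API: the names `Filter`, `filter_batch`, and `sum_selected` are hypothetical, and the reordering rule (run the filter that has dropped the most rows so far first) is an assumed simplification of adaptive filtering.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Toy columnar batch: column name -> list of values (all columns equal length).
Batch = Dict[str, List[int]]


@dataclass
class Filter:
    """Hypothetical single-column predicate with running selectivity stats."""
    column: str
    predicate: Callable[[int], bool]
    rows_in: int = 0
    rows_out: int = 0

    def selectivity(self) -> float:
        # Fraction of rows kept so far; a filter with no history counts as 1.0.
        return self.rows_out / self.rows_in if self.rows_in else 1.0

    def apply(self, batch: Batch, selection: List[int]) -> List[int]:
        # Evaluate one column for the surviving row indices only.
        values = batch[self.column]
        kept = [i for i in selection if self.predicate(values[i])]
        self.rows_in += len(selection)
        self.rows_out += len(kept)
        return kept


def filter_batch(batch: Batch, filters: List[Filter]) -> List[int]:
    """Return indices of rows passing every filter (a selection vector)."""
    num_rows = len(next(iter(batch.values())))
    selection = list(range(num_rows))
    # Adaptive step (assumed rule): most selective filter first.
    # Note that this reorders the caller's list in place.
    filters.sort(key=Filter.selectivity)
    for f in filters:
        selection = f.apply(batch, selection)
        if not selection:
            break  # nothing left to test
    return selection


def sum_selected(batch: Batch, column: str, selection: List[int]) -> int:
    """Aggregate one column over the selected rows only."""
    values = batch[column]
    return sum(values[i] for i in selection)


batch = {"price": [10, 20, 30, 40], "region": [7, 1, 7, 2]}
filters = [
    Filter("price", lambda v: v > 0),    # keeps every row in this batch
    Filter("region", lambda v: v == 7),  # keeps half the rows
]
selection = filter_batch(batch, filters)
total = sum_selected(batch, "price", selection)
```

The point of the selection vector is that later filters and the aggregation touch only the rows that survived earlier steps, and the per-filter statistics let the next batch start with the filter that has been discarding the most rows.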
- scoped_to: S3, Lakehouse (accelerates query execution over S3 data)
- depends_on: Apache Arrow (uses an Arrow-compatible columnar memory layout)
- enables: Presto (Prestissimo uses Velox as its execution engine)
- enables: Apache Spark (the Gluten project integrates Velox with Spark)
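To make "Arrow-compatible columnar memory layout" concrete, here is a minimal sketch of a fixed-width column stored the way the Arrow format describes it: one contiguous values buffer plus a validity bitmap with one bit per row (least-significant bit first, 1 meaning valid). The class `Int64Column` is hypothetical and written for illustration; it is not a Velox or Arrow class.

```python
from array import array
from typing import List, Optional


class Int64Column:
    """Toy fixed-width column: contiguous values plus a validity bitmap."""

    def __init__(self, items: List[Optional[int]]):
        self.length = len(items)
        # Null rows still occupy a value slot; 0 is stored as a placeholder.
        self.values = array("q", (0 if v is None else v for v in items))
        # One bit per row, least-significant bit first; 1 means "valid".
        self.validity = bytearray((self.length + 7) // 8)
        for i, v in enumerate(items):
            if v is not None:
                self.validity[i >> 3] |= 1 << (i & 7)

    def is_valid(self, i: int) -> bool:
        return bool(self.validity[i >> 3] & (1 << (i & 7)))

    def get(self, i: int) -> Optional[int]:
        return self.values[i] if self.is_valid(i) else None

    def null_count(self) -> int:
        # Count set bits in the bitmap; the remaining rows are null.
        valid = sum(bin(b).count("1") for b in self.validity)
        return self.length - valid


col = Int64Column([1, None, 3])
```

Keeping nulls in a separate bitmap leaves the values buffer dense and fixed-stride, which is what lets a vectorized engine process a whole column in tight loops.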
Definition
A C++ vectorized database acceleration library created by Meta, designed to be embedded into query engines to provide a unified, high-performance execution layer for data processing on S3-stored data.
Multiple query engines (Spark, Presto, Flink) each implement their own execution runtimes with varying performance characteristics. Velox provides a shared, hardware-optimized execution core that any engine can embed, raising the performance floor for S3-based analytics.
Accelerating Spark and Presto queries over S3 data, unified vectorized execution for lakehouse queries, hardware-optimized data processing.
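The "shared execution core that any engine can embed" idea can be sketched as follows. This is an illustrative toy under stated assumptions, not Velox's plan or operator API: `Plan`, `execute`, and the two front-end functions are hypothetical names, and the plan shape (filter one column, sum another) is chosen only to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

# Toy columnar batch: column name -> list of values.
Batch = Dict[str, List[int]]


@dataclass
class Plan:
    """Engine-neutral plan fragment: filter one column, then sum another."""
    filter_column: str
    predicate: Callable[[int], bool]
    sum_column: str


def execute(plan: Plan, batches: Iterable[Batch]) -> int:
    """Shared core: runs the same way whichever front end built the plan."""
    total = 0
    for batch in batches:
        keys = batch[plan.filter_column]
        vals = batch[plan.sum_column]
        total += sum(v for k, v in zip(keys, vals) if plan.predicate(k))
    return total


# Two hypothetical front ends. Each only translates its own query
# representation into the shared Plan; neither implements execution.
def plan_from_sql_like(min_qty: int) -> Plan:
    return Plan("qty", lambda q: q >= min_qty, "price")


def plan_from_dataframe_like(column: str, threshold: int, target: str) -> Plan:
    return Plan(column, lambda q: q >= threshold, target)


batches = [
    {"qty": [1, 5], "price": [10, 20]},
    {"qty": [7], "price": [30]},
]
result_a = execute(plan_from_sql_like(5), batches)
result_b = execute(plan_from_dataframe_like("qty", 5, "price"), batches)
```

Because both front ends lower to the same plan type, any optimization made inside `execute` benefits every host engine at once, which is the sense in which a shared core raises the performance floor.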
Recent developments
- Axiom — composable query engines built on Velox (announced April 23, 2026). Per the facebookincubator/velox repository, the Velox project announced Axiom, a framework for composing query engines on top of the Velox execution core — the next architectural step beyond "embed Velox into your existing engine," toward "use Velox + Axiom as the foundation of a new engine without rebuilding the execution primitives from scratch." For lakehouse architects this matters because it lowers the cost of net-new analytical engine development, which previously required hand-rolling a vectorized execution layer.
- Velox-based execution shows up in independent benchmarks alongside DataFusion. Per the BARQ vectorized SPARQL paper (arXiv:2504.04584), Velox is consistently cited as one of the open-source modular execution engines (alongside Apache DataFusion) that new analytical-system research builds on. Meta's original Velox introduction blog (engineering.fb.com, 2022) remains the canonical reference for what Velox is and why; the 2026 update is that the embed-it-into-engines bet has paid off and Velox is now infrastructure rather than research code.
Resources
Official Velox site for Meta's open-source C++ vectorized execution library used as the backend for Presto, Spark, and other query engines.
Velox source repository with the vectorized execution engine, S3 connector, and Parquet/ORC reader implementations.
Meta engineering blog introducing Velox's architecture and how it accelerates analytical workloads over object storage.