ORC
Optimized Row Columnar file format specification — a columnar format with built-in indexing, compression, and predicate pushdown support, originally developed for the Hive ecosystem.
Summary
ORC is the legacy columnar format in the Hadoop/Hive ecosystem. On S3, it serves the same role as Parquet — efficient columnar storage for analytical queries — but is primarily used in organizations with existing Hive investments.
- ORC and Parquet are functionally similar for most workloads. The choice is usually driven by ecosystem (Hive → ORC, everything else → Parquet) rather than technical superiority.
- ORC's built-in ACID support (for Hive) operates differently from table format ACID (Iceberg, Delta). They are not the same concept.
used_by: Apache Spark, Trino (supported as a data file format)
solves: Cold Scan Latency (columnar format enables predicate pushdown)
scoped_to: S3, Table Formats
Definition
Optimized Row Columnar file format specification. A columnar format with built-in indexing, compression, and predicate pushdown support, originally developed for the Hive ecosystem.
ORC emerged in the Hadoop ecosystem in 2013, around the same time as Parquet, and remains in use in organizations with significant Hive and Spark-on-YARN investments. It provides similar benefits to Parquet (columnar storage, efficient analytics) with different performance trade-offs.
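To make "built-in indexing" and "predicate pushdown" concrete, the sketch below models the idea in plain Python. It is a toy illustration, not the ORC on-disk layout or any ORC library API: the `Stripe` class, `write_stripes`, and `scan_greater_than` are hypothetical names, and stripes here are cut by row count, whereas real ORC files size stripes in bytes and also keep finer-grained row-group indexes. The point it shows is that per-stripe min/max statistics let a reader discard whole stripes without touching their values.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stripe:
    """Toy stand-in for an ORC stripe: one column of values plus min/max stats."""
    values: List[int]
    min_value: int
    max_value: int


def write_stripes(column: List[int], rows_per_stripe: int) -> List[Stripe]:
    """Split a column into fixed-size stripes and record min/max per stripe."""
    stripes = []
    for start in range(0, len(column), rows_per_stripe):
        chunk = column[start:start + rows_per_stripe]
        stripes.append(Stripe(chunk, min(chunk), max(chunk)))
    return stripes


def scan_greater_than(stripes: List[Stripe], threshold: int) -> Tuple[List[int], int]:
    """Return (matching values, number of stripes actually read).

    A stripe whose max is <= threshold cannot contain a match, so it is
    skipped using only its statistics.
    """
    matches: List[int] = []
    stripes_read = 0
    for stripe in stripes:
        if stripe.max_value <= threshold:
            continue  # pruned via the statistics; values are never touched
        stripes_read += 1
        matches.extend(v for v in stripe.values if v > threshold)
    return matches, stripes_read


# Hypothetical column: order ids 1..12, written as three stripes of four rows.
order_ids = list(range(1, 13))
stripes = write_stripes(order_ids, rows_per_stripe=4)

# Equivalent of "WHERE order_id > 9": only the last stripe has to be read.
matches, stripes_read = scan_greater_than(stripes, threshold=9)
print(matches, stripes_read)  # [10, 11, 12] 1
```

Pruning works best when the filtered column is sorted or clustered, as in this example. If the values were shuffled, every stripe's min/max range would likely overlap the predicate and little could be skipped.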
ORC fits analytical data storage in Hive-centric S3 environments and compatibility with legacy Hadoop data lakes.
Resources
The authoritative ORC file format specification defining the stripe structure, type system, encoding schemes, compression, indexes, and file footer layout.
Official Apache ORC documentation covering configuration, Hive/Spark integration, ACID support, and performance tuning.
Canonical repository containing the C++ and Java implementations of the ORC format, plus the specification source and test files.