Technology
Concrete tools, systems, or platforms with version histories and maintainers.
14 nodes
AWS S3
Amazon's fully managed object storage service, the origin and reference implementation of the S3 API.
MinIO
An open-source, S3-compatible object storage server designed for high performance and self-hosted deployment.
Ceph
A distributed storage system providing object, block, and file storage in a unified platform. It offers S3 compatibility through its RADOS Gateway (RGW).
Apache Ozone
A scalable, distributed object storage system in the Hadoop ecosystem with an S3-compatible interface.
Apache Iceberg
An open table format for large analytic datasets. Manages metadata, snapshots, and schema evolution for collections of data files (typically Parquet) ...
Delta Lake
An open table format and storage layer providing ACID transactions, scalable metadata, and schema enforcement on data stored in object storage. Origin...
Apache Hudi
A table format and data management framework optimized for incremental data processing (upserts, deletes, and change data capture) on object storage...
DuckDB
An in-process analytical database engine (like SQLite for analytics) that reads Parquet, Iceberg, and other formats directly from S3 without requiring...
Trino
A distributed SQL query engine for federated analytics across heterogeneous data sources, with deep support for S3-backed data lakes and lakehouses.
ClickHouse
A column-oriented DBMS designed for real-time analytical queries, with native support for reading from and writing to S3.
Apache Spark
A distributed compute engine for large-scale data processing (batch ETL, streaming, SQL, and machine learning) over S3-stored data.
LanceDB
A vector database that stores data in the Lance columnar format directly on object storage. Designed for serverless vector search without a separate i...
StarRocks
An MPP analytical database with native lakehouse capabilities, able to directly query S3 data in Parquet, ORC, and Iceberg formats.
Apache Flink
A distributed stream processing framework that processes data in real time, with S3 serving as durable storage for checkpoints and state snapshots and as an output sink.