Technology

Concrete tools, systems, or platforms with version histories and maintainers.

72 nodes
AWS S3 Technology

Amazon's fully managed object storage service — the origin and reference implementation of the S3 API. As of December 2025, the ma…

11 4
MinIO Technology

An open-source, S3-compatible object storage server designed for high performance and self-hosted deployment. As of February 2026,…

9 4
Ceph Technology

A distributed storage system providing object, block, and file storage in a unified platform. S3 compatibility via its RADOS Gatew…

6 3
Apache Ozone Technology

A scalable, distributed object storage system in the Hadoop ecosystem with an S3-compatible interface.

4 3
Apache Iceberg Technology

An open table format for large analytic datasets. Manages metadata, snapshots, and schema evolution for collections of data files …

25 4
Delta Lake Technology

An open table format and storage layer providing ACID transactions, scalable metadata, and schema enforcement on data stored in ob…

12 4
Apache Hudi Technology

A table format and data management framework optimized for incremental data processing — upserts, deletes, and change data capture…

13 4
DuckLake Technology

A lakehouse metadata format that stores table metadata in an embedded SQL database (DuckDB) instead of file-based manifests on S3.…

5 2
DuckDB Technology

An in-process analytical database engine (like SQLite for analytics) that reads Parquet, Iceberg, and other formats directly from …

14 3
Trino Technology

A distributed SQL query engine for federated analytics across heterogeneous data sources, with deep support for S3-backed data lak…

11 4
ClickHouse Technology

A column-oriented DBMS designed for real-time analytical queries, with native support for reading from and writing to S3.

5 4
Apache Spark Technology

A distributed compute engine for large-scale data processing — batch ETL, streaming, SQL, and machine learning — over S3-stored da…

12 4
LanceDB Technology

A vector database that stores data in the Lance columnar format directly on object storage. Designed for serverless vector search …

7 4
Weaviate Technology

An open-source vector database with hybrid search combining BM25 keyword matching and vector similarity in a single query, plus mu…

3 2
Qdrant Technology

A Rust-based vector search engine with native payload filtering and a custom HNSW index implementation that applies metadata filte…

2 2
Milvus Technology

A distributed vector database built for billion-scale similarity search, using a microservices architecture with SSD caching for h…

3 2
StarRocks Technology

An MPP analytical database with native lakehouse capabilities, able to directly query S3 data in Parquet, ORC, and Iceberg formats…

5 3
Apache Flink Technology

A distributed stream processing framework that processes data in real time, with S3 as a checkpoint store, state backend, and output…

9 3
S3 Express One Zone Technology

An AWS S3 storage class delivering single-digit millisecond latency for frequently accessed data. Uses directory buckets in a sing…

8 3
Amazon S3 Tables Technology

An AWS-managed feature providing native Apache Iceberg tables as a built-in S3 capability, with automated compaction, snapshot man…

7 3
Amazon S3 Vectors Technology

A native vector storage and search capability built into S3, enabling storage and querying of embeddings directly in S3 without a …

7 3
Amazon S3 Metadata Technology

An AWS feature that automatically generates queryable metadata tables (in Apache Iceberg format) over S3 objects, enabling SQL-bas…

7 3
SeaweedFS Technology

An open-source distributed storage system with an S3-compatible API, architecturally optimized for billions of small and large fil…

6 3
Cloudflare R2 Technology

An S3-compatible object storage service from Cloudflare with zero egress fees, integrated with the Cloudflare global edge network.

6 3
Backblaze B2 Technology

A low-cost S3-compatible cloud storage service with free egress to CDN partners through the Bandwidth Alliance, designed for cost-…

6 3
Wasabi Technology

An S3-compatible cloud storage service with a fixed pricing model — no egress fees, no API request fees, approximately $5–7/TB/mon…

4 2
VAST Data Technology

A disaggregated all-flash data platform providing unified access via S3, NFS, and SMB protocols, optimized for AI and deep learnin…

4 2
Dell ECS Technology

An enterprise-grade software-defined object storage platform from Dell with an S3-compatible API, designed for on-premises and hybrid …

5 2
NetApp StorageGRID Technology

A software-defined S3-compatible object storage system with policy-driven information lifecycle management (ILM), designed for ent…

6 3
Pure Storage FlashBlade Technology

An all-flash unified file and object storage platform from Pure Storage with an S3-compatible API, designed for AI, analytics, and mo…

4 2
Garage Technology

A lightweight, self-hosted, geo-distributed S3-compatible object storage system designed for small distributed clusters, edge depl…

5 3
OpenDAL Technology

A unified data access layer providing a single API for accessing 40+ storage backends including S3, GCS, Azure Blob, HDFS, and loc…

4 3
lakeFS Technology

A Git-like version control system for data lakes on S3, providing branching, committing, merging, and rollback for datasets stored…

6 3
Rook Technology

A Kubernetes storage orchestrator that deploys and manages Ceph clusters on Kubernetes, providing K8s-native S3-compatible object …

6 3
GeeseFS Technology

A high-performance FUSE-based filesystem that provides POSIX-compatible access to S3-compatible object storage, optimized for AI/M…

4 2
JuiceFS Technology

A POSIX-compliant distributed filesystem that uses S3-compatible object storage as its data backend and a separate metadata engine…

3 2
Apache Polaris Technology

An open-source REST catalog for Apache Iceberg with centralized RBAC, originally developed by Snowflake and donated to Apache.

7 3
Apache Gravitino Technology

A unified metadata lake — "catalog of catalogs" — that federates Iceberg, Hive, Kafka, and file-based data sources into a single g…

6 3
Unity Catalog Technology

An open-source, multi-format data catalog by Databricks (Linux Foundation), supporting Iceberg, Delta Lake, Hudi, and unstructured…

7 3
Apache XTable Technology

A zero-copy metadata translator (Apache incubating, formerly OneTable) that converts between Iceberg, Delta Lake, and Hudi metadat…

7 3
Delta UniForm Technology

A Delta Lake feature that automatically generates Iceberg and Hudi metadata for Delta tables, enabling cross-format reads without …

6 2
Apache Paimon Technology

An Apache top-level streaming lakehouse table format built on LSM-tree architecture, designed for high-frequency real-time writes …

10 3
Flink CDC Technology

Apache Flink connectors for reading database change logs (MySQL binlog, PostgreSQL WAL) and streaming them directly into lakehouse…

8 3
Estuary Flow Technology

A managed real-time data integration platform with exactly-once connectors for streaming data from databases and SaaS APIs into S3…

5 2
Bytewax Technology

A Python-native stream processing framework built on a Rust-based Timely Dataflow engine, designed for real-time data transformati…

4 2
Apache Airflow Technology

A platform for programmatically authoring, scheduling, and monitoring workflows as directed acyclic graphs (DAGs) written in Pytho…

3 2
RustFS Technology

A high-performance, Rust-based, S3-compatible object storage server positioned as a truly open-source alternative to MinIO.

7 3
Marquez Technology

The reference implementation for OpenLineage — an open-source metadata and lineage service with a web UI for visualizing data flow…

5 3
Apache Ranger Technology

A framework for fine-grained security and centralized auditing across the Hadoop and lakehouse ecosystem, providing column-level a…

6 2
S3 Bucket Key Technology

An S3 feature that reduces KMS API calls by up to 99% by caching encryption key material at the bucket level rather than making in…

3 2
WarpStream Technology

A stateless, S3-native data streaming platform with Kafka protocol compatibility. No local disks, no brokers to manage — all data …

9 2
Apache Doris Technology

A real-time analytical database with native lakehouse capabilities, querying Iceberg, Hudi, and Paimon tables on S3 directly. Late…

7 3
Infinidat Technology

An enterprise storage platform with S3-compatible object storage, delivering hardware-defined performance guarantees at petabyte s…

3 2
SoftIron Technology

A purpose-built, hardware-defined storage appliance providing S3-compatible object storage on Ceph with auditable supply-chain man…

4 2
AWS Glue Catalog Technology

AWS's fully managed metadata catalog service that stores table definitions, partition information, and schema metadata for data st…

9 3
Hive Metastore Technology

The original metadata catalog service from the Apache Hive project that stores table schemas, partition mappings, and storage loca…

8 3
Dremio Technology

A lakehouse query engine that provides SQL analytics directly on S3-stored data with integrated Iceberg table management, data ref…

8 3
Athena Technology

AWS's serverless, pay-per-query SQL engine that runs queries directly against data stored in S3 without requiring infrastructure p…

7 3
Debezium Technology

An open-source distributed platform for change data capture (CDC) that streams row-level changes from databases (PostgreSQL, MySQL…

7 3
DataFusion Technology

An extensible query execution framework written in Rust, built on Apache Arrow, that provides a SQL query planner and execution en…

7 3
Polars Technology

A high-performance DataFrame library written in Rust with Python and Node.js bindings, designed for fast columnar analytics with l…

6 3
Kafka Tiered Storage Technology

An Apache Kafka feature (KIP-405) that offloads older log segments from broker-local disks to S3-compatible object storage, extend…

10 3
Redpanda Technology

A Kafka-compatible streaming platform written in C++ that provides a single binary deployment with built-in Tiered Storage to S3, …

9 3
Project Nessie Technology

An open-source transactional catalog for data lakes that provides Git-like branching, tagging, and commit semantics for Iceberg ta…

9 3
Airbyte Technology

An open-source data integration platform that provides pre-built connectors for extracting data from hundreds of sources (APIs, da…

6 3
Spark Structured Streaming Technology

Apache Spark's stream processing API that enables continuous, micro-batch, or near-real-time ingestion of data streams into S3-bac…

7 3
Velox Technology

A C++ vectorized execution engine developed by Meta that provides a unified, high-performance data processing backend usable by mu…

6 3
dlt Technology

A Python library for declarative data loading (data load tool) that simplifies building data pipelines to extract from APIs and lo…

7 3
OpenMetadata Technology

An open-source metadata platform providing a centralized catalog for data discovery, quality, lineage, and governance across S3-ba…

9 3
DataHub Technology

An open-source metadata platform originally developed at LinkedIn that provides data discovery, lineage tracking, governance, and …

9 3
Apache Atlas Technology

An open-source metadata management and governance framework originally built for the Hadoop ecosystem, providing classification, l…

9 3
Tigris Data Technology

An S3-compatible, globally distributed object storage platform engineered to optimize small-object workloads through metadata inli…

4 2