Technology

Cloudian HyperStore

On-prem, S3-compatible, exabyte-scale object storage whose 8.2.6 release is NVIDIA-Certified and supports S3 over RDMA for direct GPU-to-storage data paths.

Summary

Where it fits

It sits beneath GPU clusters as a self-hosted S3 data plane, competing with public-cloud object storage and with software-defined stacks like MinIO and Ceph. Its differentiator is RDMA/GPUDirect throughput rather than just capacity economics, positioning it for AI-factory builds that want object storage to keep pace with NVMe and GPUs.

Misconceptions / Traps
  • The headline throughput figures of 35 GB/s and 210 GB/s require an RDMA/RoCE-capable network fabric; over plain TCP you get standard S3 throughput instead.
  • "NVIDIA-Certified" here is the Foundation level (validated up to 128 GPUs), not an unlimited-scale guarantee.
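
To put the headline figures in perspective, the Python sketch below estimates how long one full sequential read takes at a given sustained throughput. The 35 GB/s and 210 GB/s values are the ones quoted above; the 500 TB dataset size and the 5 GB/s plain-TCP rate are illustrative assumptions rather than measurements, and the function name is hypothetical.

```python
def transfer_seconds(size_gb: float, throughput_gb_per_s: float) -> float:
    """Ideal time to move size_gb at a sustained rate, ignoring all overheads."""
    if throughput_gb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_gb / throughput_gb_per_s


# Assumption: a 500 TB training set, in decimal units (1 TB = 1000 GB).
DATASET_GB = 500 * 1000

# 35 and 210 GB/s are the figures quoted in the text; 5 GB/s is an assumed
# placeholder for a plain-TCP S3 path, not a measured value.
scenarios = {
    "assumed TCP baseline (5 GB/s)": 5.0,
    "RDMA headline (35 GB/s)": 35.0,
    "RDMA headline (210 GB/s)": 210.0,
}

for label, rate in scenarios.items():
    minutes = transfer_seconds(DATASET_GB, rate) / 60
    print(f"{label}: {minutes:.1f} min")
```

The model is deliberately naive: it treats throughput as constant and ignores object size, request concurrency, and metadata overhead, all of which affect real S3 read rates.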

Key Connections
  • accelerates NVIDIA GPUDirect RDMA for S3 — moves objects into GPU memory bypassing CPU/HTTP.
  • alternative_to MinIO — both are self-hosted S3, but HyperStore is appliance/enterprise-scale with RDMA.
  • solves Egress Cost — on-prem capacity model removes per-GB egress charges.
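
The egress point can be made concrete with simple arithmetic. The sketch below assumes a flat per-GB egress rate and a monthly read volume; both numbers are hypothetical placeholders, not quoted prices from any provider, and the function name is made up for illustration. It deliberately ignores the capital and operating costs of running storage on-prem.

```python
def monthly_egress_cost(read_tb_per_month: float, usd_per_gb: float) -> float:
    """Egress charge for one month of reads at a flat per-GB rate."""
    if read_tb_per_month < 0 or usd_per_gb < 0:
        raise ValueError("inputs must be non-negative")
    # Decimal units: 1 TB = 1000 GB.
    return read_tb_per_month * 1000 * usd_per_gb


# Assumptions: 200 TB read out per month, 0.09 USD/GB placeholder rate.
READ_TB = 200
ASSUMED_RATE_USD_PER_GB = 0.09

cloud = monthly_egress_cost(READ_TB, ASSUMED_RATE_USD_PER_GB)
# On-prem reads carry no per-GB egress fee, so the rate is zero.
on_prem = monthly_egress_cost(READ_TB, 0.0)

print(f"public cloud (assumed rate): {cloud:,.0f} USD/month")
print(f"on-prem: {on_prem:,.0f} USD/month")
```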

Definition

What it is

Cloudian HyperStore is an on-premises, S3-compatible object storage platform that scales to exabytes and exposes a native S3 API. It runs on commodity hardware as a fully self-hosted alternative to public-cloud object storage, targeting AI training, fine-tuning, inference, and data-pipeline workloads. Version 8.2.6 is the current NVIDIA-Certified release.
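
Because the platform speaks the S3 API, an existing S3 client can in principle be pointed at a self-hosted endpoint by overriding the endpoint URL. The sketch below relies on boto3's standard `endpoint_url` parameter. The hostname, credentials, region, bucket, and object key are placeholders, and the helper names are hypothetical. It covers only the ordinary HTTP(S) S3 path; the S3-over-RDMA path needs RDMA-aware client software that is not shown here.

```python
def s3_client_kwargs(endpoint_url: str, access_key: str, secret_key: str,
                     region: str = "us-east-1") -> dict:
    """Build keyword arguments for boto3.client() against a self-hosted S3 endpoint."""
    if not endpoint_url.startswith(("http://", "https://")):
        raise ValueError("endpoint_url must include http:// or https://")
    return {
        "service_name": "s3",
        "endpoint_url": endpoint_url,          # placeholder on-prem endpoint
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "region_name": region,                 # assumed value, used for request signing
    }


def make_client(kwargs: dict):
    """Create the client. boto3 is imported lazily so the helper above has no dependencies."""
    import boto3
    return boto3.client(**kwargs)


# Usage (needs boto3 and a reachable endpoint; every value is a placeholder):
#   kwargs = s3_client_kwargs("https://s3.hyperstore.example.internal",
#                             "ACCESS_KEY_PLACEHOLDER", "SECRET_KEY_PLACEHOLDER")
#   s3 = make_client(kwargs)
#   s3.download_file("training-data", "shards/shard-0000.tar", "/tmp/shard-0000.tar")
```

Keeping the endpoint in one place means the same application code can target a public-cloud bucket or an on-prem cluster by changing configuration only.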

Why it exists

It gives teams a private, S3-API-native data plane that can sit directly under GPU clusters, so AI data never has to leave the building or traverse public-cloud egress. Its S3-over-RDMA path lets GPUs pull objects without the HTTP/TCP overhead that normally caps object-storage throughput, making it a credible local-first substrate for LLM and vector workloads.

Primary use cases

AI training data lakes, vector-database storage backends, model checkpointing, GPU-fed inference pipelines, on-prem S3 for regulated/sovereign data, large-scale backup and archive.
