Architecture

NVMe-backed Object Tier

An architecture placing NVMe flash as a high-performance local storage tier beneath the S3 API, serving hot objects with microsecond-level latency while cold objects remain on HDD or cloud storage.

Summary

What it is

An architecture placing NVMe flash as a high-performance local storage tier beneath the S3 API, serving hot objects with microsecond-level latency while cold objects remain on HDD or cloud storage.

Where it fits

NVMe-backed tiers remove the cold scan latency inherent in HDD-based object stores for the objects they hold. By placing frequently accessed objects on NVMe, the architecture delivers flash-speed reads through the standard S3 API, bridging the gap between local SSD performance and S3 ecosystem compatibility.
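The read path described above can be sketched as a read-through tier. This is a minimal illustration, not any product's implementation: the class name TieredObjectStore, the promote_after threshold, and the LRU eviction choice are all assumptions, and plain dictionaries stand in for the NVMe and cold backends.

```python
from collections import OrderedDict


class TieredObjectStore:
    """Toy read-through tier: a bounded hot tier in front of a cold backend."""

    def __init__(self, cold, hot_capacity_bytes, promote_after=2):
        self.cold = cold                    # dict standing in for HDD or cloud storage
        self.hot = OrderedDict()            # dict standing in for the NVMe tier, kept in LRU order
        self.hot_capacity_bytes = hot_capacity_bytes
        self.hot_bytes = 0
        self.promote_after = promote_after  # hypothetical knob: cold reads before promotion
        self.cold_reads = {}                # per-key count of reads served from cold

    def get(self, key):
        """Return (data, tier), where tier is "hot" or "cold"."""
        if key in self.hot:
            self.hot.move_to_end(key)       # mark as most recently used
            return self.hot[key], "hot"
        data = self.cold[key]               # raises KeyError if the object does not exist
        self.cold_reads[key] = self.cold_reads.get(key, 0) + 1
        if self.cold_reads[key] >= self.promote_after:
            self._promote(key, data)
        return data, "cold"

    def _promote(self, key, data):
        if len(data) > self.hot_capacity_bytes:
            return                          # object can never fit; leave it cold
        # Evict least recently used objects until the new one fits.
        while self.hot_bytes + len(data) > self.hot_capacity_bytes:
            _, evicted = self.hot.popitem(last=False)
            self.hot_bytes -= len(evicted)
        self.hot[key] = data
        self.hot_bytes += len(data)
        self.cold_reads.pop(key, None)


store = TieredObjectStore(cold={"a": b"aaaa"}, hot_capacity_bytes=4)
print([store.get("a")[1] for _ in range(3)])  # ['cold', 'cold', 'hot']
```

A real deployment would sit behind an S3-compatible endpoint and persist tier metadata. The sketch only shows the decision order: check the hot tier, fall back to cold, then promote once the object has proven itself hot.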

Misconceptions / Traps

NVMe capacity is expensive per GB. The tier only works economically when a small percentage of objects are hot. Without effective tiering policies, costs escalate quickly.
NVMe tier management adds operational complexity. Cache eviction, promotion policies, and tier migration must be tuned for the workload's access patterns.
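The cost trap can be made concrete with a capacity-weighted average. The sketch below is illustrative only: the function name blended_cost_per_gb is hypothetical, the prices are placeholder units rather than market data, and it assumes each object is stored on exactly one tier (if the NVMe tier is a cache over a full cold copy, the cold term applies to all data instead).

```python
def blended_cost_per_gb(hot_fraction, nvme_cost_per_gb, cold_cost_per_gb):
    """Capacity-weighted cost per GB when hot_fraction of the data sits on NVMe."""
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    return hot_fraction * nvme_cost_per_gb + (1.0 - hot_fraction) * cold_cost_per_gb


# Placeholder prices in arbitrary units: NVMe is assumed to cost 5x the cold tier per GB.
NVME, COLD = 5.0, 1.0
for hot_fraction in (0.05, 0.25, 0.50):
    cost = blended_cost_per_gb(hot_fraction, NVME, COLD)
    print(f"hot={hot_fraction:.0%} blended={cost:.2f} ({cost / COLD:.2f}x cold-only)")
```

Under these assumed prices, a 5% hot set costs about 1.2x a cold-only store, while a 50% hot set costs 3x. That is why the tier depends on a small hot fraction and on policies that keep it small.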

Key Connections

depends_on NVMe-oF / NVMe over TCP — the transport for disaggregated flash
solves Cold Scan Latency — flash-speed access for hot objects
scoped_to Tiered Storage, Object Storage

Definition

What it is

Using NVMe flash storage as a high-performance local tier beneath or alongside S3-compatible object storage to reduce tail latency for hot objects and time-sensitive workloads.

Why it exists

Even with separation of storage and compute, some workloads (AI inference, real-time analytics) cannot tolerate S3's HTTP-based latency. An NVMe tier provides sub-millisecond local access for hot data while S3 serves as the durable, cost-effective cold tier.
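How much the NVMe tier helps tail latency depends on how often reads actually land on it. The following sketch models reads as a two-point distribution (hot tier or cold tier) and reports a nearest-rank percentile. The function name and the latency figures (0.5 ms hot, 50 ms cold) are assumptions chosen for illustration, not measurements.

```python
import math


def latency_percentile_ms(hit_ratio, hot_ms, cold_ms, percentile, requests=10_000):
    """Nearest-rank percentile when each read costs either hot_ms or cold_ms."""
    hits = round(hit_ratio * requests)
    samples = sorted([hot_ms] * hits + [cold_ms] * (requests - hits))
    rank = max(1, math.ceil(percentile / 100 * requests))
    return samples[rank - 1]


# Assumed latencies: 0.5 ms from the NVMe tier, 50 ms from the cold tier.
for hit_ratio in (0.95, 0.995):
    p50 = latency_percentile_ms(hit_ratio, 0.5, 50.0, 50)
    p99 = latency_percentile_ms(hit_ratio, 0.5, 50.0, 99)
    print(f"hit_ratio={hit_ratio}: p50={p50} ms, p99={p99} ms")
```

In this model the median is served from flash in both cases, but the 99th percentile only drops to the hot-tier latency once the hit ratio exceeds 99%. A tier sized for a good average can still leave the tail on the cold path.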

Primary use cases

AI/ML hot data tier, real-time analytics acceleration, low-latency checkpoint storage, tiered object storage.