Tail Latency on Object Storage
Definition
Tail latency on object storage is the p99 (and p999) end-to-end response-time degradation that emerges when high-concurrency AI workloads run against public-cloud object storage. Average latency may look acceptable (100 ms warm, sub-second cold), but a single demand surge, noisy neighbor, or network-jitter event can push the **p99 well over 400 ms**, and under sustained parallelism the service begins returning HTTP 503 (Slow Down) throttling responses.
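To make the gap between average and tail latency concrete, here is a minimal Python sketch. The latency samples are synthetic and chosen only for illustration (980 requests at 100 ms, 20 at 450 ms); they are not measurements from any storage service, and the `percentile` helper is a hypothetical nearest-rank implementation rather than any particular monitoring tool's method.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Synthetic request latencies in milliseconds (assumed values, not real data):
# 98% of requests are warm at 100 ms, 2% hit a slow path at 450 ms.
latencies_ms = [100.0] * 980 + [450.0] * 20

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)
p999_ms = percentile(latencies_ms, 99.9)

print(f"mean={mean_ms} ms  p50={p50_ms} ms  p99={p99_ms} ms  p999={p999_ms} ms")
```

With these assumed samples the mean stays close to the warm latency while the p99 lands on the slow-path value, which is why dashboards that track only averages can miss the degradation described above.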