Pain Point

Cold Scan Latency

Slow first-query performance against S3-stored data, caused by object discovery, metadata fetching, and data transfer over HTTP.

Summary

Where it fits

Cold scan latency is the fundamental performance trade-off of the Separation of Storage and Compute pattern. Every query against S3 starts with network overhead that does not exist when querying local disk.

Misconceptions / Traps
  • Cold scan latency is not the same as S3 being slow. S3 throughput is high, but each request carries an initial latency of roughly 50-100 ms. For a query that touches many files, those per-request delays accumulate, especially when requests are issued serially.
  • Caching helps with repeat queries but not with the first query. True cold scan mitigation requires metadata-driven pruning (table formats) and intelligent prefetching.
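
To make "this adds up" concrete, the sketch below estimates the time a query spends on request latency alone when it touches many objects. The function name, the 75 ms midpoint, and the concurrency of 64 are illustrative assumptions rather than measurements, and the model deliberately ignores transfer time, throttling, and retries.

```python
import math


def request_latency_seconds(num_requests: int, latency_ms: float, concurrency: int = 1) -> float:
    """Latency-only cost when requests are issued in waves of `concurrency`."""
    if num_requests < 0 or latency_ms < 0 or concurrency < 1:
        raise ValueError("num_requests and latency_ms must be >= 0, concurrency >= 1")
    waves = math.ceil(num_requests / concurrency)  # each wave waits one round trip
    return waves * latency_ms / 1000.0


# Assumed workload: 1,000 requests at an assumed 75 ms each.
serial = request_latency_seconds(1000, 75.0)
parallel = request_latency_seconds(1000, 75.0, concurrency=64)
print(f"serial: {serial:.1f} s, 64-way parallel: {parallel:.1f} s")
```

Under these assumptions the serial case spends over a minute waiting on round trips, while parallel issue cuts that to about a second. Parallelism hides the latency but does not remove it, which is why reducing the number of requests through pruning matters as well.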

Key Connections
  • Apache Parquet solves Cold Scan Latency — columnar layout enables predicate pushdown
  • Lakehouse Architecture, Hybrid S3 + Vector Index solves Cold Scan Latency — metadata-driven access
  • Separation of Storage and Compute constrained_by Cold Scan Latency — inherent trade-off
  • StarRocks constrained_by Cold Scan Latency — first-query limited by S3 access
  • Cold Scan Latency scoped_to S3, Object Storage
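
The metadata-driven access mentioned above can be illustrated with min/max pruning: if a manifest or file footer records the value range of a column, a reader can skip objects that cannot match the filter without fetching them. The following is a minimal sketch, not any engine's implementation; `FileStats`, `prune`, the object keys, and the `ts` ranges are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FileStats:
    key: str      # object key (hypothetical)
    min_ts: int   # smallest value of the filter column in this file
    max_ts: int   # largest value of the filter column in this file


def prune(files: List[FileStats], lo: int, hi: int) -> List[FileStats]:
    """Keep only files whose [min_ts, max_ts] range overlaps the query range [lo, hi]."""
    return [f for f in files if f.max_ts >= lo and f.min_ts <= hi]


# Hypothetical manifest with non-overlapping ranges per file.
manifest = [
    FileStats("part-0.parquet", 0, 99),
    FileStats("part-1.parquet", 100, 199),
    FileStats("part-2.parquet", 200, 299),
]

selected = prune(manifest, 150, 220)
print([f.key for f in selected])  # ['part-1.parquet', 'part-2.parquet']
```

Every file dropped here is a request that is never sent to object storage, which is how pruning helps the first query as well as repeat queries.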

Definition

What it is

The delay experienced on the first query against S3-stored data, caused by object discovery (listing), metadata fetching, and data transfer over the network.
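
The three phases named in this definition can be turned into a rough request count. The sketch below assumes one listing pass over a prefix, one metadata (footer) GET per object, and a fixed number of ranged data GETs per object; those per-object figures and the names `ColdScanRequests` and `count_requests` are assumptions for illustration. The 1,000-key page size reflects the documented maximum for an S3 ListObjectsV2 response.

```python
import math
from dataclasses import dataclass

LIST_PAGE_SIZE = 1000  # S3 ListObjectsV2 returns at most 1,000 keys per response


@dataclass(frozen=True)
class ColdScanRequests:
    list_calls: int     # object discovery
    metadata_gets: int  # metadata fetching
    data_gets: int      # data transfer

    @property
    def total(self) -> int:
        return self.list_calls + self.metadata_gets + self.data_gets


def count_requests(num_objects: int, ranges_per_object: int) -> ColdScanRequests:
    """Count requests per phase under the assumptions stated above."""
    # Even an empty prefix costs one LIST call.
    list_calls = max(1, math.ceil(num_objects / LIST_PAGE_SIZE))
    return ColdScanRequests(
        list_calls=list_calls,
        metadata_gets=num_objects,                  # assumed: one footer GET per object
        data_gets=num_objects * ranges_per_object,  # assumed: fixed ranged GETs per object
    )


plan = count_requests(num_objects=2500, ranges_per_object=3)
print(plan, plan.total)
```

In this model the listing phase is a small share of the total; the metadata and data phases scale with the number of objects, so skipping objects reduces both at once.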

Recent developments

Latest signals
  • CloudTS (FAST 2026) — compacted timeseries metadata to reduce access amplification. Per the USENIX FAST 2026 paper "An Efficient Cloud Storage Model with Compacted Timeseries Metadata", CloudTS proposes separately managing metadata and data on cloud storage to reduce access amplification on cold queries — concrete research-grade evidence that the cold-scan-latency problem is being attacked at the data-model layer, not just by caching.
  • Apache Doris 4.1 — 90% object-storage cost reduction with cold-query optimization. Per VeloDB's Apache Doris 4.1 announcement, Doris 4.1 delivers unified storage and retrieval for AI workloads with cold-query optimization, reporting 90% object-storage cost reduction. The pattern across engines in 2026: Cold Scan Latency is increasingly mitigated through tighter coupling between table-format metadata and engine-side caching rather than per-engine NVMe-cache layers alone.
  • Practitioner mitigation: a warm cache of 8 GB RAM + 140 GB NVMe per node holds weeks of timeseries data. Per OpenData's Prometheus-on-object-storage write-up, on an r5d.xlarge node, 8 GB of RAM plus a 140 GB NVMe disk cache keeps several weeks of timeseries data locally warm; cold queries pay a 10–100 ms object-store round trip. This is the empirical floor: query latency on warm-cached data approaches that of local NVMe, while on truly cold data it remains tens of milliseconds at best.
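
The warm-versus-cold distinction in the last item can be sketched with a small read-through cache. This is a toy under stated assumptions, not the design of any system cited above: `ReadThroughCache`, `fake_object_store`, the 64-byte capacity, and the chunk keys are hypothetical, and the fetch function stands in for a remote GET without simulating its latency.

```python
from collections import OrderedDict
from typing import Callable


class ReadThroughCache:
    """Tiny LRU read-through cache keyed by object key, bounded by total bytes."""

    def __init__(self, fetch: Callable[[str], bytes], capacity_bytes: int) -> None:
        self._fetch = fetch
        self._capacity = capacity_bytes
        self._items: "OrderedDict[str, bytes]" = OrderedDict()
        self._size = 0
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        data = self._fetch(key)  # cold path: this is where the round trip is paid
        if len(data) <= self._capacity:
            self._items[key] = data
            self._size += len(data)
            while self._size > self._capacity:  # evict least recently used entries
                _, evicted = self._items.popitem(last=False)
                self._size -= len(evicted)
        return data


def fake_object_store(key: str) -> bytes:
    # Stand-in for a remote GET; a real fetch would add network latency here.
    return key.encode() * 4


cache = ReadThroughCache(fake_object_store, capacity_bytes=64)
cache.get("chunk-a")  # first read: miss, goes to the object store
cache.get("chunk-a")  # repeat read: hit, served locally
print(cache.hits, cache.misses)  # 1 1
```

The first read of any key always goes to the object store, so a cache of this kind only speeds up repeat access; data that has been evicted or never read still pays the full round trip.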
