Small Object Storage at Scale — Overcoming the Latency Tax
Problem Framing
Standard S3 imposes per-request latency overhead regardless of object size: a 1 KB GET pays roughly the same round-trip overhead as a 1 MB GET, but returns 1,000x less data. Workloads dominated by millions of kilobyte-sized objects (log events, ML feature vectors, IoT telemetry readings) pay a severe latency and API cost tax — S3 GET pricing at $0.0004 per 1,000 requests means reading 100 million 1 KB objects costs $40 in API fees alone, for 100 GB of actual data. Architectures that inline small payloads into metadata, coalesce adjacent keys, or use object-storage engines with LSM-backed caching can deliver sub-10ms latency at a fraction of the cost.
Relevant Nodes
- Topics: S3, Object Storage
- Technologies: Tigris Data, S3 Express One Zone
- Standards: S3 Directory Bucket
- Pain Points: Small Files Problem, Small Files Amplification, Request Amplification, Request Pricing Models, Object Listing Performance
Decision Path
Quantify your small-object distribution. Use S3 Inventory or S3 Storage Lens to profile object size distribution across your buckets. Identify the percentage of objects under 64 KB, under 1 KB, and under 256 bytes. If more than 50% of objects are under 64 KB, you have a small-object-dominant workload.
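The profiling step above reduces to a cumulative size histogram. A minimal sketch — the input sizes here are made up, standing in for the Size column you would parse out of an S3 Inventory report:

```python
def size_profile(object_sizes, thresholds=(256, 1024, 64 * 1024)):
    """Cumulative fraction of objects smaller than each threshold (bytes).

    `object_sizes` is any iterable of sizes in bytes; in practice you would
    feed it sizes parsed from an S3 Inventory report rather than a literal
    list.
    """
    sizes = list(object_sizes)
    if not sizes:
        return {t: 0.0 for t in thresholds}
    return {t: sum(s < t for s in sizes) / len(sizes) for t in thresholds}

# Illustrative sizes from a log-event-heavy bucket (hypothetical data):
profile = size_profile([100, 900, 3_000, 50_000, 5_000_000])
# profile[64 * 1024] -> 0.8: 80% of objects are under 64 KB, so this
# workload is small-object-dominant by the 50% rule above.
```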
Calculate API cost impact. Multiply your monthly GET request count by the per-request price, then compare the result against your monthly storage bill. If API costs exceed storage costs, your workload is request-cost-dominated — the standard S3 pricing model is working against you.
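The arithmetic is simple enough to script. A sketch reproducing the $40 figure from the problem framing — the storage price is an assumed first-tier S3 Standard rate (~$0.023/GB-month); substitute current prices for your region:

```python
def monthly_get_cost(requests, price_per_1k=0.0004):
    """API cost in dollars for `requests` GETs at standard S3 GET pricing."""
    return requests / 1_000 * price_per_1k

def request_cost_dominated(requests, stored_gb, storage_price_per_gb=0.023):
    """True when monthly API spend exceeds monthly storage spend.

    storage_price_per_gb is an assumed S3 Standard first-tier rate;
    check current pricing before relying on the comparison.
    """
    return monthly_get_cost(requests) > stored_gb * storage_price_per_gb

# Reading 100 million 1 KB objects: roughly $40 in GET fees,
# against ~$2.30/month to store the 100 GB itself.
api_cost = monthly_get_cost(100_000_000)
dominated = request_cost_dominated(100_000_000, stored_gb=100)
```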
Evaluate Tigris Data for metadata-inlined storage. Tigris Data is an S3-compatible object store that inlines small objects directly into its metadata layer, bypassing the separate data fetch that standard S3 requires for every GET. For objects under 4 KB, this eliminates the per-object I/O overhead entirely.
- Tigris benchmarks show 2–5x lower latency than standard S3 for small-object workloads.
- S3 API-compatible — no application changes required for migration.
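Because Tigris speaks the S3 API, the migration can be as small as pointing an existing client at a different endpoint. A hedged configuration sketch with boto3 — the endpoint URL and credential placeholders are illustrative, so confirm the current values in Tigris's documentation:

```python
import boto3

# Same code path as standard S3 -- only the endpoint and credentials change.
# Endpoint URL below is illustrative; verify it against Tigris's docs.
s3 = boto3.client(
    "s3",
    endpoint_url="https://fly.storage.tigris.dev",
    aws_access_key_id="tid_...",        # Tigris access key (placeholder)
    aws_secret_access_key="tsec_...",   # Tigris secret key (placeholder)
)

# Existing GET/PUT call sites work unchanged, e.g.:
# s3.put_object(Bucket="my-bucket", Key="events/123", Body=b"...")
# obj = s3.get_object(Bucket="my-bucket", Key="events/123")
```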
Consider S3 Express One Zone for latency reduction. S3 Express One Zone delivers single-digit millisecond first-byte latency regardless of object size. For small-object workloads where latency (not API cost) is the primary concern, Express One Zone reduces the per-request overhead.
- Express One Zone is single-AZ (no cross-AZ durability) and more expensive per GB. Use as a cache tier, not primary storage.
- Directory Bucket namespace reduces LIST overhead for prefix-heavy key patterns.
Design key coalescing for adjacent small objects. If small objects have natural adjacency (sequential log entries, time-series readings), coalesce them into larger composite objects at write time. Store an index mapping original keys to byte offsets within the composite object.
- This reduces object count by 100–1,000x, proportionally reducing API costs and LIST overhead.
- Trade-off: individual object access requires a range GET instead of a simple GET.
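The coalescing scheme above can be sketched in a few lines: pack payloads into one composite blob, record each key's (offset, length) in an index, and read individual records back by offset. The key names and payloads are hypothetical; in production the final slice becomes an S3 ranged GET rather than local slicing:

```python
def coalesce(records):
    """Pack an ordered {key: bytes} mapping into one composite blob.

    Returns the blob plus an index mapping each original key to
    (offset, length) within it -- store the blob as a single object
    and the index alongside it (or in a separate lookup store).
    """
    blob = bytearray()
    index = {}
    for key, payload in records.items():
        index[key] = (len(blob), len(payload))
        blob += payload
    return bytes(blob), index

def read_one(blob, index, key):
    """Fetch a single record from the composite object.

    Against S3 this becomes a ranged GET, e.g.
    Range=f"bytes={off}-{off + length - 1}"; here we slice locally
    to show the offset arithmetic.
    """
    off, length = index[key]
    return blob[off:off + length]

# Three adjacent log events (hypothetical) coalesced into one object:
records = {"log/0001": b"evt-a", "log/0002": b"evt-b", "log/0003": b"evt-c"}
blob, index = coalesce(records)
event = read_one(blob, index, "log/0002")  # single range read, not 3 GETs
```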
Benchmark read and write latency vs. standard S3. Test with your actual object size distribution, access pattern (random vs. sequential), and concurrency level. Measure p50, p99, and p999 latency — small-object tail latency can be significantly worse than median latency on standard S3 due to per-request overhead variance.
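A minimal harness for the percentile measurements above — `fetch` is a placeholder for whatever client call you are testing (boto3, a Tigris client, etc.), so the harness stays vendor-neutral; it runs sequentially for brevity, whereas a real benchmark should also vary concurrency:

```python
import time

def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples."""
    ranked = sorted(samples)
    idx = min(len(ranked) - 1, int(p / 100 * len(ranked)))
    return ranked[idx]

def benchmark(fetch, keys):
    """Time `fetch(key)` per key and report p50/p99/p999 in milliseconds.

    `fetch` wraps the real GET call for the backend under test; any
    callable taking a key works, keeping the harness vendor-neutral.
    """
    latencies = []
    for key in keys:
        start = time.perf_counter()
        fetch(key)
        latencies.append((time.perf_counter() - start) * 1_000)
    return {p: percentile(latencies, p) for p in (50, 99, 99.9)}

# Stub fetch standing in for a real client call (placeholder):
stats = benchmark(lambda key: None, [f"k{i}" for i in range(1_000)])
# Compare stats[50] against stats[99.9] per backend -- a wide gap is the
# tail-latency signature of per-request overhead variance.
```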
What Changed Over Time
- Standard S3 was designed for objects in the MB-to-GB range. Its per-request pricing and latency model assumes each GET retrieves meaningful data volume.
- Small-object workloads grew rapidly with IoT, ML feature stores, and event-driven architectures, exposing the mismatch between S3's pricing model and these access patterns.
- S3 Express One Zone (2023) addressed the latency dimension but not the API cost dimension.
- Tigris Data (2024) attacked the problem at the storage engine level by inlining small objects into metadata, eliminating the per-object I/O overhead.
- The pattern: small-object storage is fragmenting into a specialized tier, separate from general-purpose S3, driven by the economic and latency penalties of treating every object identically.