Directory Namespace / Listing Bottlenecks

Summary

What it is

Performance degradation when navigating deep prefix hierarchies in S3's flat namespace, where listing operations become increasingly expensive as prefix depth and object count grow.

Where it fits

S3's flat namespace simulates directories through prefixes, but the illusion breaks down at scale. Listing objects under a deep prefix is a paginated scan over the matching keys, not a lookup in a directory index, because no such index exists. This bottleneck affects data discovery, table format partition scanning, and lifecycle operations.
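To make the "simulated directory" idea concrete, here is a minimal sketch of a delimiter-style listing over a flat, sorted list of keys. The function name `list_one_level` and the sample keys are hypothetical, and the sketch only emulates the observable behavior (keys at the current level plus rolled-up common prefixes). It makes no claim about how S3 implements listing internally.

```python
from bisect import bisect_left


def list_one_level(sorted_keys, prefix="", delimiter="/"):
    """Emulate a delimiter-based listing over a flat, sorted key list.

    Returns (contents, common_prefixes): keys directly under `prefix`,
    and the rolled-up "subdirectory" prefixes one level below it.
    """
    contents, common_prefixes = [], []
    # Keys are sorted, so every key sharing `prefix` sits in one contiguous run.
    start = bisect_left(sorted_keys, prefix)
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # walked past the end of the prefix range
        rest = key[len(prefix):]
        cut = rest.find(delimiter)
        if cut == -1:
            contents.append(key)  # a "file" at this level
        else:
            rolled_up = prefix + rest[:cut + 1]
            # Nested keys collapse into a single common prefix.
            if not common_prefixes or common_prefixes[-1] != rolled_up:
                common_prefixes.append(rolled_up)
    return contents, common_prefixes


# Hypothetical sample keys, for illustration only.
keys = sorted([
    "logs/2024/01/a.json",
    "logs/2024/01/b.json",
    "logs/2024/02/c.json",
    "logs/readme.txt",
    "tmp/x.bin",
])
contents, prefixes = list_one_level(keys, "logs/")
print(contents, prefixes)
```

Note that even this toy version has to walk every key under the prefix to discover a single "subdirectory" entry: the directory is derived from the keys at list time, not stored anywhere.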

Misconceptions / Traps

S3 does not have directories. A prefix is a string match against the start of the object key, not a filesystem structure. Restructuring prefixes does not create indexes; it only changes the filter pattern.
Directory buckets (S3 Express One Zone) partially address this with a true directory namespace, but are limited to a single AZ and have different pricing.

Key Connections

related_to Object Listing Performance — a more specific manifestation of the listing problem
Directory Buckets / Hot Object Storage solves Directory Namespace / Listing Bottlenecks — true directory structure
Amazon S3 Metadata solves Directory Namespace / Listing Bottlenecks — SQL-based metadata queries
scoped_to S3, Object Storage

Definition

What it is

Performance degradation when using directory-style key naming conventions with deep prefix hierarchies, causing listing operations to slow dramatically as the logical directory tree grows.

Recent developments

Latest signals

S3 Directory Buckets reorganize the namespace into a true hierarchy. They organize data hierarchically into directories, as opposed to the flat sorting structure of general-purpose buckets, which gives a structural escape from prefix-based pseudo-directories. Directory buckets are required for S3 Express One Zone and are optimized for millions of requests per second per bucket. Per AWS Docs: Working with Directory Buckets.
Directory buckets return unsorted ListObjectsV2 results. Specifying prefix=dir1/ limits results to that subdirectory path, but the keys in the response are not returned in lexicographic order. Applications that rely on lexicographic ordering from ListObjectsV2 need to be adapted when moving to directory buckets, for example by sorting on the client side. Per AWS Docs: Best Practices to Optimize S3 Express One Zone Performance.
ListObjectsV2 performs better when fewer directories are traversed per page. The optimization rule for directory buckets is to structure the hierarchy so that each page of results requires traversing fewer subdirectories. This inverts the Hive-era intuition that a more granular hierarchy is better. Per AWS Docs: S3 Express One Zone Performance.
Entropy in prefixes hurts directory-bucket performance. Guidance for general-purpose buckets historically recommended random (high-entropy) prefixes to distribute load across partitions. Directory buckets manage load distribution internally, so adding entropy to their key names is counterproductive. Per AWS Docs: Directory Buckets.
General-purpose buckets: 10 distinct prefixes → 55,000 GET req/s. S3 automatically partitions general-purpose buckets by key prefix to spread load, and each partitioned prefix supports at least 5,500 GET/HEAD requests per second, so 10 distinct prefixes can theoretically sustain 10 × 5,500 = 55,000 GET requests per second. The "use prefixes for performance" rule still applies to general-purpose buckets even as directory buckets supersede the pattern. Per OneUptime: How to Use S3 Prefixes and Partitioning for Better Performance.
Delimiter-based browsing skips deep keys; full recursive listing remains O(N). With a delimiter, a browsing-style listing returns one level of the hierarchy and rolls deeper keys up into common prefixes, each of which needs a separate listing call to expand. A full recursive listing through millions of keys is still O(N), fetched in pages of at most 1,000 keys, and no namespace structure changes that. Per AWS Docs: Organizing Objects Using Prefixes.
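The pagination cost and the unsorted-results caveat can both be seen in a small sketch. The helper `list_all_keys` below assumes a client object exposing `list_objects_v2` with boto3-style keyword arguments (`Bucket`, `Prefix`, `ContinuationToken`) and response fields (`Contents`, `IsTruncated`, `NextContinuationToken`). `FakeUnsortedClient` is a hypothetical in-memory stand-in written for this example, not part of any SDK, so the sketch runs without AWS credentials.

```python
import random


class FakeUnsortedClient:
    """Hypothetical in-memory stand-in for an S3 client (illustration only)."""

    def __init__(self, keys, page_size=1000, seed=0):
        self._keys = list(keys)
        random.Random(seed).shuffle(self._keys)  # mimic unsorted list results
        self._page_size = page_size

    def list_objects_v2(self, Bucket, Prefix="", ContinuationToken=None):
        matching = [k for k in self._keys if k.startswith(Prefix)]
        start = int(ContinuationToken or 0)
        end = start + self._page_size
        resp = {
            "Contents": [{"Key": k} for k in matching[start:end]],
            "IsTruncated": end < len(matching),
        }
        if resp["IsTruncated"]:
            resp["NextContinuationToken"] = str(end)
        return resp


def list_all_keys(client, bucket, prefix=""):
    """Collect every key under `prefix`, following continuation tokens.

    Returns (sorted_keys, pages_fetched). The number of pages grows
    linearly with the number of matching keys.
    """
    keys, pages, token = [], 0, None
    while True:
        kwargs = {"Bucket": bucket, "Prefix": prefix}
        if token:
            kwargs["ContinuationToken"] = token
        resp = client.list_objects_v2(**kwargs)
        pages += 1
        keys.extend(obj["Key"] for obj in resp.get("Contents", []))
        if not resp.get("IsTruncated"):
            break
        token = resp["NextContinuationToken"]
    # Do not rely on server-side ordering; sort locally if order matters.
    return sorted(keys), pages


# Hypothetical bucket name and keys, for illustration only.
client = FakeUnsortedClient(
    [f"dir1/part-{i:05d}" for i in range(2500)] + ["dir2/other"]
)
keys, pages = list_all_keys(client, "example-bucket", "dir1/")
print(len(keys), pages)
```

With 2,500 matching keys and a page size of 1,000, the loop issues three requests. Against a real bucket the same loop shape applies, and the request count scales with the number of keys regardless of how the prefixes are arranged.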

Connections 4

Outbound 2

scoped_to 2

S3 Object Storage

Inbound 2

solves 2

SeaweedFS S3 Directory Bucket

Resources 2

Docs (High)

docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.ht...

S3 ListObjectsV2 API reference documenting the 1,000-object pagination limit and delimiter-based directory simulation.

Blog (Medium)

xuanwo.io/2025/02-why-s3-list-objects-taking-120s-to-respond...

Deep investigation into S3 listing latency revealing how delete markers and versioning cause severe performance degradation.