Technology

Amazon S3 Metadata

Summary

What it is

An AWS feature that automatically generates queryable metadata tables, in Apache Iceberg format, for the objects in an S3 bucket, enabling SQL-based object discovery and governance.

Where it fits

S3 Metadata bridges the gap between S3's minimal per-object metadata and the rich, queryable metadata that data governance requires. It automatically builds Iceberg tables from object metadata, and those tables can be queried with engines such as Amazon Athena or Apache Spark.
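To make the SQL-based discovery concrete, here is a minimal Python sketch that builds an Athena-style query for large objects under a key prefix. The function name, the table identifier, and the column names (key, size, storage_class, last_modified_date, record_type) are assumptions for illustration and should be checked against the current metadata table schema before use.

```python
def build_discovery_query(table: str, prefix: str, min_size_bytes: int, limit: int = 100) -> str:
    """Build a SQL string that finds large objects under a key prefix.

    `table` is a hypothetical, already-quoted table identifier. The column
    names are assumptions about the metadata table schema.
    """
    if not isinstance(min_size_bytes, int) or min_size_bytes < 0:
        raise ValueError("min_size_bytes must be a non-negative integer")
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")

    # Escape single quotes so the prefix is safe inside a SQL string literal.
    # LIKE wildcards (% and _) inside the prefix are not escaped in this sketch.
    safe_prefix = prefix.replace("'", "''")

    return (
        "SELECT key, size, storage_class, last_modified_date\n"
        f"FROM {table}\n"
        f"WHERE key LIKE '{safe_prefix}%'\n"
        "  AND record_type = 'CREATE'\n"  # assumed value marking object creation rows
        f"  AND size >= {min_size_bytes}\n"
        "ORDER BY size DESC\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    # Hypothetical table identifier; substitute the one shown in your query engine.
    print(build_discovery_query('"my_namespace"."journal"', "logs/2024/", 100 * 1024 * 1024))
```

The resulting string can be submitted through whichever engine is in use. Building the query as plain text keeps the sketch independent of any particular client library.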

Misconceptions / Traps
  • Not the same as user-defined S3 tags or custom metadata headers. Tags and custom metadata are attributes attached to individual objects; S3 Metadata creates actual Iceberg tables that record object metadata and can be queried with SQL.
  • Metadata tables are updated asynchronously, so there is a delay between an object's creation and the appearance of its metadata in the Iceberg table.
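
Because of the asynchronous delay noted above, code that writes an object and then immediately looks it up in the metadata table may find nothing. One way to handle this is to poll with a timeout. The sketch below is a generic helper, not an AWS API: wait_for_metadata and the lookup callback are hypothetical names, and the default timeout and interval are arbitrary placeholders rather than documented latencies.

```python
import time
from typing import Callable, Optional


def wait_for_metadata(
    lookup: Callable[[str], Optional[dict]],
    key: str,
    timeout_s: float = 300.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[dict]:
    """Poll `lookup(key)` until it returns a row or the timeout elapses.

    `lookup` is a caller-supplied (hypothetical) function that queries the
    metadata table and returns a row dict, or None if the key is not there yet.
    """
    deadline = clock() + timeout_s
    while True:
        row = lookup(key)
        if row is not None:
            return row
        # Stop if another wait would take us past the deadline.
        if clock() + interval_s > deadline:
            return None
        sleep(interval_s)


if __name__ == "__main__":
    # Fake lookup that "finds" the object on the third attempt.
    attempts = {"n": 0}

    def fake_lookup(k: str) -> Optional[dict]:
        attempts["n"] += 1
        return {"key": k} if attempts["n"] >= 3 else None

    print(wait_for_metadata(fake_lookup, "reports/a.csv", timeout_s=5, interval_s=0.1))
```

Passing sleep and clock as parameters keeps the helper easy to test without real waiting. A workflow that cannot tolerate the delay should read the object directly from S3 rather than rely on the metadata table.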

Key Connections
  • solves Object Listing Performance — SQL queries replace expensive LIST operations
  • scoped_to Metadata-First Object Storage — the AWS implementation of metadata-first design
  • scoped_to Metadata Management — automated metadata generation and querying

Definition

What it is

An AWS feature that automatically generates and maintains a queryable metadata table (Apache Iceberg format) for the objects in a bucket, making object metadata queryable with SQL.

Why it exists

An S3 bucket can hold billions of objects, but S3 itself offers only limited ways to query their metadata. S3 Metadata surfaces system and custom metadata as queryable Iceberg tables, enabling SQL-based discovery and governance at scale.

Primary use cases

Data governance at scale, SQL-based object discovery, automated metadata-driven lifecycle management, compliance auditing.
