Technology

Amazon S3 Tables

Summary

What it is

An AWS-managed S3 capability that provides native Apache Iceberg tables with automated compaction (Binpack, Sort, or Auto strategies; 512 MB default target file size, 64 MB minimum), snapshot lifecycle management, and orphan-file garbage collection. It exposes an Iceberg REST Catalog endpoint natively and accepts direct Amazon Kinesis writes into table buckets without a Lambda intermediary.
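To make the compaction settings concrete, here is a minimal sketch of a helper that builds a compaction configuration and enforces the 64 MB minimum mentioned above. The function name, the strategy strings, and the payload field names (`status`, `settings`, `icebergCompaction`, `targetFileSizeMB`, `strategy`) are assumptions modeled on the S3 Tables maintenance-configuration API and should be checked against the current AWS documentation before use.

```python
from typing import Any, Dict

# Values taken from the description above.
DEFAULT_TARGET_MB = 512
MIN_TARGET_MB = 64

# Assumption: strategy identifiers; verify the exact strings in the AWS docs.
STRATEGIES = {"auto", "binpack", "sort", "z-order"}


def build_compaction_config(
    target_mb: int = DEFAULT_TARGET_MB, strategy: str = "auto"
) -> Dict[str, Any]:
    """Build a (hypothetical) compaction maintenance payload.

    Raises ValueError for an unknown strategy or a target below the minimum.
    """
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown compaction strategy: {strategy!r}")
    if target_mb < MIN_TARGET_MB:
        raise ValueError(f"target file size must be >= {MIN_TARGET_MB} MB")
    return {
        "status": "enabled",
        "settings": {
            "icebergCompaction": {
                "targetFileSizeMB": target_mb,
                "strategy": strategy,
            }
        },
    }


if __name__ == "__main__":
    print(build_compaction_config(128, "sort"))
```

The helper only builds and validates the payload; sending it to AWS (for example through an SDK client) is deliberately left out so the sketch runs without credentials.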

Where it fits

S3 Tables removes the operational burden of managing the Iceberg table lifecycle on S3. Instead of running your own compaction and snapshot-expiration jobs, you let AWS run them. This addresses the Metadata Overhead at Scale pain point, with cited gains of up to 3× query performance and, when paired with Intelligent-Tiering, up to 80% lower storage cost versus self-managed Iceberg on standard S3.

Misconceptions / Traps
  • Not a query engine. S3 Tables manages Iceberg metadata and compaction; you still need Spark, Athena, Trino, or DuckDB to read the data.
  • Compaction strategy matters. Binpack handles unsorted tables; Sort (including Z-order) requires a declared sort order but enables much more aggressive file skipping. Auto picks for you but only works well if you've told it what to sort on.
  • Direct Kinesis → S3 Tables ingest removes the Lambda hop, but it doesn't remove the need to think about streaming schema evolution: schema changes still hit Iceberg metadata.
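The file-skipping point in the compaction bullet can be illustrated with a toy model of min/max pruning. This is a simplified sketch, not how S3 Tables or any engine is implemented: `DataFile`, `layout`, and `files_to_scan` are hypothetical names, and each file is reduced to the minimum and maximum of a single integer key.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataFile:
    path: str
    min_key: int
    max_key: int


def layout(keys: List[int], rows_per_file: int) -> List[DataFile]:
    """Split keys into fixed-size files and record per-file min/max stats."""
    files = []
    for i in range(0, len(keys), rows_per_file):
        chunk = keys[i:i + rows_per_file]
        name = f"file-{i // rows_per_file}.parquet"
        files.append(DataFile(name, min(chunk), max(chunk)))
    return files


def files_to_scan(files: List[DataFile], lo: int, hi: int) -> List[DataFile]:
    """Keep only files whose [min_key, max_key] range overlaps [lo, hi]."""
    return [f for f in files if f.max_key >= lo and f.min_key <= hi]


# Interleaved arrival order: every file ends up spanning most of the key range.
unsorted_keys = [k for start in range(4) for k in range(start, 16, 4)]
sorted_keys = sorted(unsorted_keys)

if __name__ == "__main__":
    for label, keys in (("unsorted", unsorted_keys), ("sorted", sorted_keys)):
        hits = files_to_scan(layout(keys, rows_per_file=4), lo=5, hi=6)
        print(label, [f.path for f in hits])
```

With the unsorted layout every file's range overlaps the query, so nothing can be skipped; after sorting, only one of the four files overlaps. Bin-packing alone changes file sizes but not this overlap, which is why a declared sort order matters.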

Key Connections
  • implements Iceberg Table Spec — native Iceberg table management
  • implements Iceberg REST Catalog Spec — native REST catalog endpoint
  • augments Compaction — binpack + sort + auto strategies run continuously
  • solves Metadata Overhead at Scale — automated compaction, snapshots, orphan GC
  • scoped_to Lakehouse — managed lakehouse tables on S3
  • constrained_by Vendor Lock-In — AWS-specific managed feature

Definition

What it is

AWS-managed Apache Iceberg tables as a native S3 feature, providing table-level storage optimization, automatic compaction, and snapshot management without external infrastructure. Exposes the Iceberg REST Catalog Spec natively so external engines (Spark, Trino, Athena, DuckDB) attach as standard Iceberg clients. Accepts direct Amazon Kinesis writes into table buckets, removing the Lambda compute hop that previously sat between streaming ingest and Iceberg commit.
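As a rough illustration of what "attach as a standard Iceberg client" can look like, the sketch below assembles REST catalog properties for a table bucket. The endpoint pattern, the SigV4 property names, and the `s3tables` signing name are assumptions based on common Iceberg REST client configuration and should be verified against the AWS and client documentation; the helper name and the example ARN are hypothetical.

```python
from typing import Dict


def rest_catalog_properties(region: str, table_bucket_arn: str) -> Dict[str, str]:
    """Assemble Iceberg REST catalog properties for an S3 table bucket.

    All property names and the endpoint format are assumptions to verify.
    """
    return {
        "type": "rest",
        # Assumed regional endpoint pattern for the S3 Tables REST catalog.
        "uri": f"https://s3tables.{region}.amazonaws.com/iceberg",
        # The table bucket ARN is used as the catalog warehouse.
        "warehouse": table_bucket_arn,
        # Requests are expected to be SigV4-signed for the s3tables service.
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "s3tables",
        "rest.signing-region": region,
    }


if __name__ == "__main__":
    props = rest_catalog_properties(
        "us-east-1",
        "arn:aws:s3tables:us-east-1:111122223333:bucket/example-bucket",  # hypothetical
    )
    for key, value in props.items():
        print(f"{key} = {value}")
```

With PyIceberg installed, a dictionary like this could be passed as keyword properties to `pyiceberg.catalog.load_catalog`; Spark and Trino take equivalent settings through their own catalog configuration.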

Why it exists

Operating Iceberg tables on S3 requires managing compaction, snapshot expiry, and metadata cleanup. S3 Tables automates these operations as a built-in S3 feature, reducing operational overhead for lakehouse tables.
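For a sense of the bookkeeping that snapshot expiry involves when you run it yourself, here is a minimal sketch of one common retention policy: always keep the newest N snapshots, and expire the rest once they exceed a maximum age. The function and parameter names are hypothetical, and whether S3 Tables applies exactly this rule is an assumption.

```python
from typing import List


def snapshots_to_expire(
    ages_hours: List[float], max_age_hours: float, min_to_keep: int
) -> List[float]:
    """Return the ages of snapshots eligible for expiry.

    The newest `min_to_keep` snapshots are always retained, even if they
    are older than `max_age_hours`.
    """
    newest_first = sorted(ages_hours)
    candidates = newest_first[min_to_keep:]
    return [age for age in candidates if age > max_age_hours]


if __name__ == "__main__":
    # Four snapshots aged 1 h, 30 h, 200 h and 500 h; keep at least one,
    # expire anything else older than 120 h.
    print(snapshots_to_expire([1, 30, 200, 500], max_age_hours=120, min_to_keep=1))
```

A real job would also have to delete the data and metadata files that only the expired snapshots reference, which is where orphan-file cleanup comes in.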

Primary use cases

Managed lakehouse tables on S3, zero-ops Iceberg tables, automated compaction and snapshot management, direct Kinesis → Iceberg streaming ingest.
