Technology

Amazon S3 Tables

Summary

What it is

An AWS-managed feature providing native Apache Iceberg tables as a built-in S3 capability with automated Binpack / Sort / Auto compaction (512MB default target, 64MB minimum), snapshot lifecycle management, and orphan-file garbage collection. Exposes the Iceberg REST Catalog natively and accepts direct Amazon Kinesis writes into table buckets without a Lambda intermediary.
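To make the binpack idea concrete, here is a toy planner that groups small files into rewrite groups no larger than a target size. It is a first-fit-decreasing sketch for illustration only, not AWS's actual algorithm. The function name `plan_binpack` is hypothetical, and treating 64MB as the smallest allowed target is one reading of the figures quoted above.

```python
from typing import List

MB = 1024 * 1024
DEFAULT_TARGET_BYTES = 512 * MB  # default target file size quoted above
MIN_TARGET_BYTES = 64 * MB       # assumed to be the smallest allowed target


def plan_binpack(file_sizes: List[int],
                 target_bytes: int = DEFAULT_TARGET_BYTES) -> List[List[int]]:
    """Group small files into rewrite groups of at most target_bytes.

    Hypothetical first-fit-decreasing planner; real compaction planning
    also considers partitions, delete files, and snapshot state.
    """
    if target_bytes < MIN_TARGET_BYTES:
        raise ValueError("target is below the 64MB minimum")
    groups: List[List[int]] = []
    totals: List[int] = []
    for size in sorted(file_sizes, reverse=True):
        if size >= target_bytes:
            continue  # already at or above the target, leave it alone
        for i, total in enumerate(totals):
            if total + size <= target_bytes:
                groups[i].append(size)
                totals[i] += size
                break
        else:
            # No existing group has room, so start a new one.
            groups.append([size])
            totals.append(size)
    # Rewriting a single file on its own gains nothing, so drop such groups.
    return [g for g in groups if len(g) > 1]
```

With files of 300MB, 200MB, 100MB, 600MB, and 12MB, this sketch merges the 300MB, 200MB, and 12MB files into one 512MB group and leaves the other two untouched.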

Where it fits

S3 Tables removes the operational burden of managing the Iceberg table lifecycle on S3. Instead of running your own compaction and snapshot-expiration jobs, you let AWS manage them. This addresses the Metadata Overhead at Scale pain point, with reported gains of up to 3× faster query performance and up to 80% lower storage cost (when paired with Intelligent-Tiering) versus self-managed Iceberg on standard S3.

Misconceptions / Traps
  • Not a query engine. S3 Tables manages Iceberg metadata and compaction; you still need Spark, Athena, Trino, or DuckDB to read the data.
  • Compaction strategy matters. Binpack handles unsorted tables; Sort (including Z-order) requires a declared sort order but enables much more aggressive file skipping. Auto picks for you but only works well if you've told it what to sort on.
  • Kinesis → S3 Tables direct ingest removes the Lambda hop but doesn't remove the need to think about streaming schema evolution — that still hits Iceberg metadata.
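
The file-skipping point in the compaction bullet can be illustrated with a small simulation. The sketch below assumes an engine prunes files using per-file min/max column statistics, which is how Iceberg-style planning generally works. The helper names and the synthetic data are hypothetical.

```python
from typing import List, Tuple


def file_ranges(values: List[int], rows_per_file: int) -> List[Tuple[int, int]]:
    """Split values into files in write order; record each file's (min, max)."""
    chunks = (values[i:i + rows_per_file]
              for i in range(0, len(values), rows_per_file))
    return [(min(chunk), max(chunk)) for chunk in chunks]


def files_to_scan(ranges: List[Tuple[int, int]], lo: int, hi: int) -> int:
    """Count files whose [min, max] range overlaps the predicate lo <= x <= hi."""
    return sum(1 for fmin, fmax in ranges if fmax >= lo and fmin <= hi)


# Synthetic column: the values 0..99 in a scattered but deterministic order.
values = [(i * 37) % 100 for i in range(100)]

unsorted_files = file_ranges(values, rows_per_file=10)       # binpack-like layout
sorted_files = file_ranges(sorted(values), rows_per_file=10)  # sort-compacted layout

# Predicate: 20 <= x <= 29
scanned_unsorted = files_to_scan(unsorted_files, 20, 29)
scanned_sorted = files_to_scan(sorted_files, 20, 29)
```

In the sorted layout only one file's range overlaps the predicate, while in the scattered layout several files must be opened because each one spans most of the value range.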

Key Connections
  • implements Iceberg Table Spec — native Iceberg table management
  • implements Iceberg REST Catalog Spec — native REST catalog endpoint
  • augments Compaction — binpack + sort + auto strategies run continuously
  • solves Metadata Overhead at Scale — automated compaction, snapshots, orphan GC
  • scoped_to Lakehouse — managed lakehouse tables on S3
  • constrained_by Vendor Lock-In — AWS-specific managed feature

Definition

What it is

AWS-managed Apache Iceberg tables as a native S3 feature, providing table-level storage optimization, automatic compaction, and snapshot management without external infrastructure. Exposes the **Iceberg REST Catalog Spec** natively so external engines (Spark, Trino, Athena, DuckDB) attach as standard Iceberg clients. Accepts direct Amazon Kinesis writes into table buckets, removing the Lambda compute hop that previously sat between streaming ingest and Iceberg commit.
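
As a rough sketch of what "attach as a standard Iceberg client" can look like, the helper below assembles REST catalog connection properties. The property keys follow PyIceberg's REST catalog naming and the endpoint pattern follows AWS's published examples, but both should be checked against current documentation. The helper name, region, account ID, and bucket name are hypothetical.

```python
from typing import Dict


def s3_tables_rest_catalog_properties(region: str,
                                      table_bucket_arn: str) -> Dict[str, str]:
    """Build REST catalog properties for an S3 table bucket (illustrative).

    Assumes SigV4-signed requests against the regional S3 Tables
    Iceberg REST endpoint; verify keys and URL before relying on them.
    """
    return {
        "type": "rest",
        "uri": f"https://s3tables.{region}.amazonaws.com/iceberg",
        "warehouse": table_bucket_arn,
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "s3tables",
        "rest.signing-region": region,
    }


# Hypothetical account ID and bucket name, for illustration only.
props = s3_tables_rest_catalog_properties(
    "us-east-1",
    "arn:aws:s3tables:us-east-1:111122223333:bucket/example-table-bucket",
)
```

With PyIceberg installed and AWS credentials available, a dictionary like this would typically be passed as keyword arguments to `pyiceberg.catalog.load_catalog`.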

Why it exists

Operating Iceberg tables on S3 requires managing compaction, snapshot expiry, and metadata cleanup. S3 Tables automates these operations as a built-in S3 feature, reducing operational overhead for lakehouse tables.

Primary use cases

Managed lakehouse tables on S3, zero-ops Iceberg tables, automated compaction and snapshot management, direct Kinesis → Iceberg streaming ingest.

Recent developments

Latest signals
  • Compaction prices cut up to 90% (July 1, 2025, automatic). AWS reduced S3 Tables compaction processing fees significantly: the per-object price is now 50% lower, and per-byte processing prices are 90% lower for binpack compaction and 80% lower for sort and z-order. The reduced prices took effect July 1, 2025 and apply automatically in all S3 Tables regions. Per AWS What's New: S3 Tables 90% compaction cost reduction.
  • Three compaction strategies: binpack / sort / z-order. Binpack is the default and combines files for basic optimization; sort organizes data by specific columns to reduce file scans; z-order optimizes for multi-column query patterns. Per AWS: S3 Tables features.
  • 2026 cost reality check after the price cut. Onehouse's follow-up analysis, titled "AWS S3 Tables: After the 10× Priceberg Plunge", finds that the new pricing closes most of the cost gap with self-managed Iceberg, but the managed-vs-self-managed math still favors self-managed for the largest tables under some access patterns. Per Onehouse: S3 Tables After the 10× Priceberg Plunge.
  • The earlier 20× surprise that triggered the price cut. Onehouse's original analysis, which predates the July 2025 price cut, documented operators hitting costs up to 20× higher than expected on managed S3 Tables once compaction processing fees were factored in. The AWS price cut was the response. Per Onehouse: S3 Managed Tables, Unmanaged Costs.
  • Self-managed-vs-managed deep dive for startups. AWS's builder.aws.com community published a technical deep dive comparing Amazon S3 Tables with self-managed Apache Iceberg on S3. It is useful as a procurement reference, though as an AWS-hosted source it is not vendor-neutral. Per AWS builder: S3 Tables vs Self-Managed Iceberg.
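
The percentage cuts in the first signal above can be turned into a quick estimate. The sketch below applies the stated reductions (50% per object; 90% per byte for binpack, 80% for sort and z-order) to baseline prices that the caller supplies. The baseline figures in the example are placeholders, not quoted AWS prices, and the function name is hypothetical.

```python
PER_OBJECT_CUT = 0.50
PER_BYTE_CUT = {"binpack": 0.90, "sort": 0.80, "z-order": 0.80}


def compaction_cost_after_cut(objects: int,
                              gigabytes: float,
                              old_price_per_1k_objects: float,
                              old_price_per_gb: float,
                              strategy: str) -> float:
    """Estimate compaction cost after applying the stated percentage cuts.

    Baseline prices are inputs; look up the real pre-cut and current
    prices on the AWS pricing page before using this for budgeting.
    """
    new_per_1k = old_price_per_1k_objects * (1 - PER_OBJECT_CUT)
    new_per_gb = old_price_per_gb * (1 - PER_BYTE_CUT[strategy])
    return (objects / 1000) * new_per_1k + gigabytes * new_per_gb


# Placeholder baseline prices, for illustration only.
binpack_cost = compaction_cost_after_cut(10_000, 100, 0.004, 0.05, "binpack")
sort_cost = compaction_cost_after_cut(10_000, 100, 0.004, 0.05, "sort")
```

Because the per-byte cut is smaller for sort and z-order, the same workload costs more under those strategies than under binpack, which is worth weighing against the file-skipping benefit.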
