Standard

S3 API

Summary

What it is

The HTTP-based API for object storage operations — PUT, GET, DELETE, LIST, multipart upload. The de-facto standard for object storage interoperability.

Where it fits

The S3 API is the protocol layer that makes the ecosystem interoperable. Object storage servers (MinIO, Ceph, Ozone) expose it, and compute engines (Spark, DuckDB, Trino) and table formats read and write data through it.

Misconceptions / Traps

  • The S3 API is not formally standardized by any standards body. It is a de-facto standard defined by AWS's implementation. Compatibility varies across providers.
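  • LIST returns at most 1,000 keys per request, and the only server-side filtering is by key prefix (optionally grouped by a delimiter). Each further page needs the continuation token from the previous response, so large listings are sequential. This is a fundamental performance constraint, not a configuration issue.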
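
The LIST constraint above can be sketched as a continuation-token loop. This is a minimal illustration, not a client: `list_page` is a hypothetical in-memory stand-in for a single LIST request, and it uses the last returned key as the token, whereas real servers return an opaque token. The response field names mirror those of the ListObjectsV2 operation.

```python
from typing import Iterable, List, Optional, Tuple

PAGE_LIMIT = 1000  # a LIST response carries at most 1,000 keys


def list_page(keys: Iterable[str], prefix: str, token: Optional[str],
              max_keys: int = PAGE_LIMIT) -> dict:
    """Hypothetical stand-in for one LIST request against an in-memory key set."""
    # Prefix matching is the only filtering done "server-side".
    matching = sorted(k for k in keys if k.startswith(prefix))
    if token is not None:
        # Assumption for this sketch: the token is simply the last key already returned.
        matching = [k for k in matching if k > token]
    page = matching[:max_keys]
    truncated = len(matching) > max_keys
    return {
        "Contents": page,
        "IsTruncated": truncated,
        "NextContinuationToken": page[-1] if truncated else None,
    }


def list_all(keys: Iterable[str], prefix: str = "") -> Tuple[List[str], int]:
    """Follow continuation tokens until the listing is complete.

    Returns the collected keys and the number of requests that were needed.
    """
    keys = list(keys)
    found: List[str] = []
    requests = 0
    token: Optional[str] = None
    while True:
        resp = list_page(keys, prefix, token)
        requests += 1
        found.extend(resp["Contents"])
        if not resp["IsTruncated"]:
            return found, requests
        token = resp["NextContinuationToken"]


if __name__ == "__main__":
    store = [f"logs/{i:05d}.json" for i in range(2500)] + ["data/a.parquet"]
    found, requests = list_all(store, "logs/")
    print(len(found), "keys in", requests, "requests")
    # Any filter other than prefix (here: a suffix) has to run on the client.
    print(len([k for k in found if k.endswith("0.json")]), "keys end in 0.json")
```

Because each request depends on the token from the previous one, the number of round trips grows linearly with the number of keys under a prefix.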

Key Connections

  • enables Lakehouse Architecture, Separation of Storage and Compute — the interface that makes decoupled architectures possible
  • solves Vendor Lock-In — as a de-facto interoperability standard across providers
  • AWS S3, MinIO, Ceph, Apache Ozone implements S3 API — concrete implementations
  • scoped_to S3

Note: Pain points Object Listing Performance, Lack of Atomic Rename, and S3 Consistency Model Variance reference S3 API as their origin in their definitions, but no formal edges connect S3 API to those pain points.

Definition

What it is

The HTTP-based API for object storage operations — PUT, GET, DELETE, LIST, multipart upload, and related operations on buckets and objects. The de-facto standard for object storage interoperability.
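
As a rough illustration of how these operations map onto HTTP, the sketch below builds the method and request target for a few of them. It assumes path-style addressing (`/bucket/key`) and deliberately omits the Host header, request signing, and request bodies; `s3_request` and the operation labels are hypothetical names used only for this example.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import quote, urlencode


def s3_request(op: str, bucket: str, key: str = "",
               query: Optional[Dict[str, str]] = None) -> Tuple[str, str]:
    """Return (HTTP method, path-style request target) for a basic operation.

    Sketch only: authentication and headers are left out.
    """
    # Percent-encode the key but keep "/" so the key hierarchy stays readable.
    object_path = f"/{bucket}/{quote(key, safe='/')}"
    if op in ("PUT", "GET", "DELETE"):
        # Object CRUD maps directly onto the HTTP verb of the same name.
        return op, object_path
    if op == "LIST":
        # Listing is a GET on the bucket; list-type=2 selects ListObjectsV2.
        params = {"list-type": "2", **(query or {})}
        return "GET", f"/{bucket}?{urlencode(params)}"
    if op == "CREATE_MULTIPART":
        # A multipart upload is started with a POST carrying the "uploads" subresource.
        return "POST", f"{object_path}?uploads"
    raise ValueError(f"unsupported operation: {op}")


if __name__ == "__main__":
    print(s3_request("PUT", "demo", "raw/2024/events.parquet"))
    print(s3_request("LIST", "demo", query={"prefix": "raw/"}))
    print(s3_request("CREATE_MULTIPART", "demo", "big/archive.tar"))
```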

Why it exists

Amazon defined this API for AWS S3. Because of S3's dominance, the API became the common interface that most other object storage systems implement or emulate, enabling a portable ecosystem of tools and libraries.

Primary use cases

Object CRUD operations, multipart uploads for large files, bucket-level access control, presigned URLs for temporary access.
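
For the multipart use case, a client has to pick a part size before uploading. The sketch below assumes the limits AWS documents for S3 (every part except the last is at least 5 MiB, and an upload has at most 10,000 parts); other implementations may enforce different values. `plan_parts` and its 8 MiB default are hypothetical choices for illustration.

```python
from typing import Tuple

MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB  # assumed minimum size for every part except the last
MAX_PARTS = 10_000       # assumed maximum number of parts in one upload


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding float rounding on large sizes."""
    return -(-a // b)


def plan_parts(object_size: int, preferred_part_size: int = 8 * MIB) -> Tuple[int, int]:
    """Return (part_size, part_count) that respects both assumed limits."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    # Grow the part size when the preferred size would need more than MAX_PARTS parts.
    part_size = max(preferred_part_size, MIN_PART_SIZE, ceil_div(object_size, MAX_PARTS))
    return part_size, ceil_div(object_size, part_size)


if __name__ == "__main__":
    for size in (100 * MIB, 100 * 1024 * MIB):
        part_size, count = plan_parts(size)
        print(f"{size} bytes -> {count} parts of up to {part_size} bytes")
```

Small objects keep the preferred part size, while very large objects get a larger part size so the part count stays within the limit.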

Relationships

Resources