Multi-Site Replication and Geo-Distributed Object Storage
Problem Framing
Operating object storage across multiple geographic sites introduces problems that single-site deployments never encounter. Engineers deploy multi-site storage for disaster recovery, data sovereignty compliance, edge data aggregation, or global access latency reduction — but each use case demands a different replication topology. Active-active replication with conflict resolution is fundamentally different from edge-to-core aggregation, and geo-dispersed erasure coding creates constraints that no single-site erasure scheme faces. Choosing wrong means data loss during site failures, unbounded replication lag, or repair bandwidth that saturates inter-site links.
Relevant Nodes
- Topics: Geo / Edge Object Storage
- Technologies: Garage, Dell ECS, NetApp StorageGRID, MinIO, Ceph, AWS S3
- Standards: CRDT
- Architectures: Active-Active Multi-Site Object Replication, Edge-to-Core Object Aggregation, Geo-Dispersed Erasure Coding
- Pain Points: Geo-Replication Conflict / Divergence, Rebuild Window Risk, Repair Bandwidth Saturation
Decision Path
Choose your replication topology based on the use case:
- Active-active multi-site: Both (or all) sites accept writes and replicate to each other. Use when applications at each site need local read/write access. Requires conflict resolution. Technologies: MinIO multi-site replication, Dell ECS geo-replication, Ceph multi-site RGW.
- Active-passive (DR): One site is primary, others are read-only replicas. Use for disaster recovery where RPO/RTO (recovery point / recovery time objective) requirements are defined. Simpler — no conflict resolution. Technologies: AWS S3 Cross-Region Replication, MinIO bucket replication.
- Edge-to-core aggregation: Many edge sites write data locally and replicate to a central data lake. One-way flow. Use when edge devices generate data (IoT, retail, manufacturing) that needs centralized analysis. Technologies: MinIO multi-site, custom S3-to-S3 sync pipelines.
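The edge-to-core pattern above reduces to "list, diff, push." A minimal in-memory sketch — dicts stand in for buckets, and `sync_to_core` with its etag-keyed shape is purely illustrative, not any vendor's API:

```python
# One-way edge-to-core sync reduced to list/diff/push. Buckets are
# modeled as dicts (key -> (etag, data)); in a real deployment these
# would be S3 buckets behind a client. Names here are illustrative.

def sync_to_core(edge: dict, core: dict) -> list:
    """Push objects the core lacks or holds a stale copy of.
    Strictly one-way: the core never writes back to the edge."""
    pushed = []
    for key, (etag, data) in edge.items():
        if key not in core or core[key][0] != etag:
            core[key] = (etag, data)   # add missing or overwrite stale
            pushed.append(key)
    return pushed

edge = {"sensor/1": ("v2", b"t=21"), "sensor/2": ("v1", b"t=19")}
core = {"sensor/1": ("v1", b"t=20")}            # stale copy at the core
print(sorted(sync_to_core(edge, core)))         # → ['sensor/1', 'sensor/2']
```

Because the flow is strictly one-directional, no conflict resolution is needed — which is exactly why this topology is the simplest of the three.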
Handle conflict resolution:
- Active-active replication must resolve concurrent writes to the same key. There are two approaches:
  - Last-writer-wins (LWW): Simple and deterministic, but silently drops concurrent writes. AWS S3 CRR and MinIO use this model.
  - CRDT-based resolution: Conflict-free Replicated Data Types guarantee convergence without data loss. Garage uses CRDTs for its metadata layer. More complex, but preserves all writes.
- If your workload is append-only (log data, sensor readings, backups), conflicts are rare and LWW is safe. If objects are updated in place from multiple sites, you need CRDT or application-level conflict handling.
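The trade-off between the two approaches fits in a few lines. A toy sketch, assuming a timestamped LWW register and a grow-only set (G-Set) as the simplest possible CRDT — neither is any product's actual implementation:

```python
# Contrast the two conflict-resolution approaches on one
# concurrent-write scenario. Toy sketches, not production code.

def lww_merge(a, b):
    """Last-writer-wins: values are (timestamp, payload); the higher
    timestamp survives and the other write is silently discarded."""
    return a if a[0] >= b[0] else b

def gset_merge(a: set, b: set) -> set:
    """Grow-only set CRDT: merge is set union. Union is commutative,
    associative, and idempotent, so every site converges to the same
    state regardless of delivery order -- and no write is lost."""
    return a | b

# Sites A and B write concurrently to the same key:
print(lww_merge((1700000001, "from-A"), (1700000002, "from-B")))
# → (1700000002, 'from-B'): A's write is gone
print(sorted(gset_merge({"from-A"}, {"from-B"})))
# → ['from-A', 'from-B']: both writes preserved
```

The G-Set only supports additions; real systems layer richer CRDTs (OR-sets, LWW-maps with causal context) on the same merge-by-union principle.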
Plan for rebuild and repair bandwidth:
- When a site goes offline and comes back, it must replay missed replication. This replay consumes inter-site bandwidth and competes with live replication traffic.
- Rebuild Window Risk: If a site is offline too long, its replication backlog can grow faster than the spare inter-site bandwidth can drain it, and the site never catches up. Define the maximum tolerable offline duration up front.
- Repair Bandwidth Saturation: Geo-dispersed erasure coding requires repair traffic across WAN links. Budget inter-site bandwidth explicitly: live traffic + repair traffic + headroom.
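The rebuild-window check above is simple arithmetic: the backlog accumulated while offline must drain through whatever bandwidth remains after live replication. A back-of-envelope sketch (all rates and durations are illustrative assumptions):

```python
# Back-of-envelope rebuild-window check. Units: Gbit/s and hours;
# all numbers below are illustrative assumptions, not measurements.

def catchup_hours(offline_h: float, write_gbps: float, link_gbps: float):
    """Hours to replay the backlog accumulated while offline.
    Live writes keep arriving during catch-up, so only the spare
    bandwidth (link minus ongoing write rate) drains the backlog.
    Returns None when there is no spare bandwidth: the backlog
    grows without bound and the site never catches up."""
    spare_gbps = link_gbps - write_gbps
    if spare_gbps <= 0:
        return None
    return offline_h * write_gbps / spare_gbps

# 6 h outage, 2 Gbit/s sustained writes, 10 Gbit/s inter-site link:
print(catchup_hours(6, 2, 10))   # → 1.5 (hours to catch up)
print(catchup_hours(6, 5, 5))    # → None: link fully consumed by live traffic
```

The `None` case is the failure mode the bullet describes: once live traffic saturates the link, no offline duration is tolerable.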
Evaluate geo-dispersed erasure coding:
- Standard erasure coding (e.g., 4+2 within a single site) provides durability against disk/node failures.
- Geo-dispersed erasure coding distributes fragments across sites, surviving entire site failures. But: every read may require fragments from multiple sites (higher latency), and repair after site failure moves large volumes across WAN links.
- Use geo-dispersed erasure coding when cross-site durability is required and latency tolerance allows it. Do not use it for latency-sensitive hot data.
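A quick sizing sketch makes these trade-offs concrete. It assumes fragments are spread evenly across sites and a Reed-Solomon-style repair that decodes each stripe from k surviving fragments, so rebuilding a lost site moves roughly one full copy of the logical data across the WAN; the 4+2-over-3-sites numbers are illustrative, not a recommendation:

```python
# Sizing sketch for a k+m geo-dispersed erasure code over several sites.
# Assumes fragments are spread evenly and repair decodes each stripe
# from k surviving fragments. All parameters are illustrative.

def geo_ec_sizing(k: int, m: int, sites: int, data_tb: float):
    per_site = (k + m) // sites          # fragments stored at each site
    overhead = (k + m) / k               # raw bytes per usable byte
    survives_site_loss = per_site <= m   # one site down loses per_site fragments
    # Repair reads k fragments per stripe, i.e. roughly one full copy
    # of the logical data crosses the WAN when a whole site is rebuilt.
    repair_wan_tb = data_tb if survives_site_loss else None
    return per_site, overhead, survives_site_loss, repair_wan_tb

print(geo_ec_sizing(4, 2, 3, 100.0))   # → (2, 1.5, True, 100.0)
print(geo_ec_sizing(4, 2, 2, 100.0))   # 3 fragments/site > m=2: site loss fatal
```

The last line illustrates why site count constrains the scheme: 4+2 across only two sites puts three fragments per site, so a site failure exceeds the two-fragment loss budget.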
Choose between vendor-managed and self-managed replication:
- AWS S3 CRR (Cross-Region Replication): Zero operational overhead, but limited to AWS regions. Supports same-account and cross-account replication. Each replication rule is one-directional; a two-way setup requires configuring rules in both directions.
- Self-managed (MinIO, Ceph, Dell ECS, StorageGRID): Full control over topology, bandwidth allocation, and conflict policy. Higher operational cost. Required for on-premises multi-site deployments.
- Garage: Designed from the ground up for geo-distribution. Lightweight, CRDT-based, works on heterogeneous hardware across unreliable links.
What Changed Over Time
- Early multi-site object storage was limited to enterprise products (EMC Atmos, NetApp StorageGRID) with proprietary replication protocols.
- AWS S3 Cross-Region Replication (2015) made multi-region object storage accessible but only within AWS.
- MinIO added multi-site replication, bringing active-active capabilities to self-hosted S3-compatible storage.
- Garage introduced CRDT-based metadata replication (2022+), addressing the conflict resolution problem for geo-distributed edge deployments without heavy coordination.
- The rise of data sovereignty regulations (GDPR, data residency laws) has made multi-site replication a compliance requirement, not just a DR strategy.
Sources
- docs.aws.amazon.com/AmazonS3/latest/userguide/replication.html
- min.io/docs/minio/linux/operations/concepts/multi-site-replication.htm...
- docs.ceph.com/en/latest/radosgw/multisite/
- garagehq.deuxfleurs.fr/
- crdt.tech/
- min.io/docs/minio/linux/operations/concepts/erasure-coding.html
- docs.netapp.com/us-en/storagegrid/index.html
- aws.amazon.com/blogs/storage/optimizing-storage-for-edge-computing-wit...