Guide 11

Multi-Site Replication and Geo-Distributed Object Storage

Problem Framing

Operating object storage across multiple geographic sites introduces problems that single-site deployments never encounter. Engineers deploy multi-site storage for disaster recovery, data sovereignty compliance, edge data aggregation, or global access latency reduction — but each use case demands a different replication topology. Active-active replication with conflict resolution is fundamentally different from edge-to-core aggregation, and geo-dispersed erasure coding creates constraints that no single-site erasure scheme faces. Choosing the wrong topology risks data loss during site failures, unbounded replication lag, or repair traffic that saturates inter-site links.

Relevant Nodes

  • Topics: Geo / Edge Object Storage
  • Technologies: Garage, Dell ECS, NetApp StorageGRID, MinIO, Ceph, AWS S3
  • Standards: CRDT
  • Architectures: Active-Active Multi-Site Object Replication, Edge-to-Core Object Aggregation, Geo-Dispersed Erasure Coding
  • Pain Points: Geo-Replication Conflict / Divergence, Rebuild Window Risk, Repair Bandwidth Saturation

Decision Path

  1. Choose your replication topology based on the use case:

    • Active-active multi-site: Both (or all) sites accept writes and replicate to each other. Use when applications at each site need local read/write access. Requires conflict resolution. Technologies: MinIO multi-site replication, Dell ECS geo-replication, Ceph multi-site RGW.
    • Active-passive (DR): One site is primary, others are read-only replicas. Use for disaster recovery where RPO/RTO requirements are defined. Simpler — no conflict resolution. Technologies: AWS S3 Cross-Region Replication, MinIO bucket replication.
    • Edge-to-core aggregation: Many edge sites write data locally and replicate to a central data lake. One-way flow. Use when edge devices generate data (IoT, retail, manufacturing) that needs centralized analysis. Technologies: MinIO multi-site, custom S3-to-S3 sync pipelines.
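The edge-to-core case above can be sketched as a one-way diff: find the edge objects the core has not yet seen. This is a minimal sketch in which buckets are modeled as plain key-to-ETag dicts; a real pipeline would populate them via boto3 `list_objects_v2` and copy the results with `copy_object`. The function name is illustrative, not a library API.

```python
# Minimal sketch of the "diff" step in a custom S3-to-S3 edge-to-core
# sync pipeline. Buckets are modeled as dicts of key -> ETag; a real
# pipeline would fill these via boto3 list_objects_v2 and move objects
# with copy_object. All names here are illustrative.

def keys_to_replicate(edge: dict, core: dict) -> list:
    """Return edge keys that are missing from core or differ by ETag.

    Flow is strictly one-way (edge -> core), so there are no
    conflicts to resolve: the core copy never overwrites edge data.
    """
    return sorted(
        key for key, etag in edge.items()
        if core.get(key) != etag
    )

edge_bucket = {"sensor/001.json": "aa11", "sensor/002.json": "bb22"}
core_bucket = {"sensor/001.json": "aa11"}  # 002 not yet replicated
print(keys_to_replicate(edge_bucket, core_bucket))
```

Running the diff on a schedule (or on bucket notifications) gives an idempotent sync loop: already-replicated keys compare equal and are skipped.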
  2. Handle conflict resolution:

    • Active-active replication must resolve concurrent writes to the same key. There are two approaches:
      • Last-writer-wins (LWW): Simple, deterministic, but silently drops concurrent writes. AWS S3 CRR and MinIO use this model.
      • CRDT-based resolution: Conflict-free Replicated Data Types guarantee convergence without data loss. Garage uses CRDTs for its metadata layer. More complex but preserves all writes.
    • If your workload is append-only (log data, sensor readings, backups), conflicts are rare and LWW is safe. If objects are updated in place from multiple sites, you need CRDT or application-level conflict handling.
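The difference between the two approaches can be shown in a few lines. This toy model (names and tuple layout are illustrative) treats a write as a `(timestamp, site, value)` tuple: LWW deterministically picks one concurrent write and drops the other, while a CRDT-style grow-only union keeps both as siblings for later resolution.

```python
# Toy comparison of the two conflict-resolution strategies described
# above. A "write" is a (timestamp, site, value) tuple; the layout is
# illustrative, not any particular system's wire format.

def lww_merge(a, b):
    """Last-writer-wins: keep the write with the larger timestamp.
    Ties break on site id, so every replica converges to the same
    answer -- but the losing concurrent write is silently dropped."""
    return max(a, b)  # tuple comparison: timestamp first, then site

def sibling_merge(siblings_a, siblings_b):
    """CRDT-style multi-value merge (a grow-only set of siblings):
    set union preserves every concurrent write; the application or a
    richer CRDT decides how to collapse siblings later."""
    return siblings_a | siblings_b

site_a = (1700000000, "eu", "v1-from-eu")
site_b = (1700000000, "us", "v1-from-us")  # concurrent: same timestamp

print(lww_merge(site_a, site_b))          # one write survives, one is lost
print(sibling_merge({site_a}, {site_b}))  # both writes are preserved
```

Note that both merges are commutative and deterministic, which is what lets replicas converge without coordination; they differ only in whether the losing write survives.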
  3. Plan for rebuild and repair bandwidth:

    • When a site goes offline and comes back, it must replay missed replication. This replay consumes inter-site bandwidth and competes with live replication traffic.
    • Rebuild Window Risk: If a site is offline too long, the replication backlog may grow faster than the spare inter-site bandwidth can drain it, so the returning site never catches up. Define the maximum tolerable offline duration up front and verify the link can replay that backlog within an acceptable window.
    • Repair Bandwidth Saturation: Geo-dispersed erasure coding requires repair traffic across WAN links. Budget inter-site bandwidth explicitly: live traffic + repair traffic + headroom.
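The bandwidth budget above reduces to simple arithmetic. A back-of-envelope sketch (all figures illustrative, rates in a common unit such as MB/s):

```python
# Back-of-envelope check for the rebuild-window risk described above.
# All rates must use the same unit (e.g. MB/s); numbers are illustrative.

def catch_up_seconds(offline_s, ingest_rate, link_bw, live_rate):
    """Time to replay the replication backlog after a site returns.

    backlog = ingest_rate * offline_s  (data written while the site was down)
    spare   = link_bw - live_rate      (bandwidth left after live replication)
    Returns float('inf') if live traffic already saturates the link.
    """
    spare = link_bw - live_rate
    if spare <= 0:
        return float("inf")  # backlog only grows; the site never catches up
    return (ingest_rate * offline_s) / spare

def max_offline_seconds(target_catch_up_s, ingest_rate, link_bw, live_rate):
    """Longest outage that can still be replayed within the target window."""
    spare = link_bw - live_rate
    if spare <= 0:
        return 0.0
    return target_catch_up_s * spare / ingest_rate

# 6 h outage, 50 MB/s ingest, ~125 MB/s (1 Gb/s) link, 50 MB/s live traffic:
hours = catch_up_seconds(6 * 3600, 50, 125, 50) / 3600
print(f"catch-up takes {hours:.1f} h")  # 50 * 21600 / 75 s = 4.0 h
```

The guard clause is the important part: if live replication alone saturates the link, no offline duration is tolerable, which is exactly the saturation failure mode named above.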
  4. Evaluate geo-dispersed erasure coding:

    • Standard erasure coding (e.g., 4+2 within a single site) provides durability against disk/node failures.
    • Geo-dispersed erasure coding distributes fragments across sites, surviving entire site failures. But: every read may require fragments from multiple sites (higher latency), and repair after site failure moves large volumes across WAN links.
    • Use geo-dispersed erasure coding when cross-site durability is required and latency tolerance allows it. Do not use it for latency-sensitive hot data.
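The trade-offs above can be quantified for a classic k+m Reed-Solomon layout with one fragment per site. A rough cost model (the function and field names are illustrative):

```python
# Rough cost model for a geo-dispersed k+m Reed-Solomon scheme with one
# fragment per site. Back-of-envelope only; names are illustrative.

def geo_ec_costs(k, m, object_gb):
    fragment_gb = object_gb / k            # each site stores one fragment
    return {
        "storage_overhead": (k + m) / k,   # raw bytes per logical byte
        "sites_survivable": m,             # whole-site failures tolerated
        "sites_per_read": k,               # fragments needed to decode
        # Classic Reed-Solomon repair reads k surviving fragments to
        # rebuild one lost fragment, so repairing one failed site moves
        # roughly one full object's worth of data across WAN links:
        "wan_repair_gb": k * fragment_gb,
    }

costs = geo_ec_costs(k=4, m=2, object_gb=1.0)
print(costs)  # 1.5x overhead, survives 2 site losses, reads touch 4 sites
```

The `sites_per_read` and `wan_repair_gb` terms are what separate geo-dispersed coding from single-site 4+2: the same arithmetic that is cheap inside a rack turns into cross-site round trips and WAN transfer here.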
  5. Choose between vendor-managed and self-managed replication:

    • AWS S3 CRR (Cross-Region Replication): Zero operational overhead, but limited to AWS regions. Supports same-account and cross-account replication. Each replication rule is one-directional, so CRR on its own does not give you active-active with conflict resolution.
    • Self-managed (MinIO, Ceph, Dell ECS, StorageGRID): Full control over topology, bandwidth allocation, and conflict policy, at a higher operational cost. Required for on-premises multi-site.
    • Garage: Designed from the ground up for geo-distribution. Lightweight, CRDT-based, works on heterogeneous hardware across unreliable links.
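For the vendor-managed path, the operational surface is essentially one JSON document. This sketch builds the `ReplicationConfiguration` that boto3's `put_bucket_replication` expects for S3 CRR; the role and bucket ARNs are placeholders, not real resources, and the rule ID is an arbitrary label.

```python
# Sketch of the vendor-managed path: the ReplicationConfiguration
# document that boto3's put_bucket_replication expects for S3 CRR.
# The role and bucket ARNs below are placeholders, not real resources.

def build_crr_config(role_arn: str, dest_bucket_arn: str) -> dict:
    """One one-directional rule replicating every new object version
    in the source bucket to the destination bucket. Both buckets
    must already have versioning enabled."""
    return {
        "Role": role_arn,  # IAM role S3 assumes to write the replicas
        "Rules": [{
            "ID": "replicate-all",
            "Priority": 1,
            "Status": "Enabled",
            "Filter": {},  # empty filter: the rule applies to all objects
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {"Bucket": dest_bucket_arn},
        }],
    }

config = build_crr_config(
    "arn:aws:iam::123456789012:role/crr-role",
    "arn:aws:s3:::dr-replica-bucket",
)
# Applying it requires real credentials and buckets:
#   import boto3
#   boto3.client("s3").put_bucket_replication(
#       Bucket="primary-bucket", ReplicationConfiguration=config)
print(config["Rules"][0]["Status"])
```

Compare this single document with the self-managed options above, where topology, bandwidth shaping, and conflict policy are all yours to configure — and to operate.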

What Changed Over Time

  • Early multi-site object storage was limited to enterprise products (EMC Atmos, NetApp StorageGRID) with proprietary replication protocols.
  • AWS S3 Cross-Region Replication (2015) made multi-region object storage accessible but only within AWS.
  • MinIO added multi-site replication, bringing active-active capabilities to self-hosted S3-compatible storage.
  • Garage introduced CRDT-based metadata replication (2022+), addressing the conflict resolution problem for geo-distributed edge deployments without heavy coordination.
  • The rise of data sovereignty regulations (GDPR, data residency laws) has made multi-site replication a compliance requirement, not just a DR strategy.

Sources