Architecture

Geo-Dispersed Erasure Coding

Summary

What it is

An erasure coding scheme that distributes data fragments and parity blocks across geographically separated sites, providing durability and data locality at lower storage overhead than full replication.

Where it fits

Geo-dispersed erasure coding extends the durability model of object storage beyond a single data center. Instead of replicating a full copy to each site (3x overhead with three sites), data is erasure-coded across sites (typically 1.2-1.5x overhead) and can still be reconstructed from any subset of sites that together hold enough fragments.
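
The overhead figures above follow from the code parameters. The sketch below is a minimal illustration, assuming an MDS code (any k of the k + m fragments rebuild the object) and fragments spread as evenly as possible across sites; the function names are hypothetical and not taken from any particular storage system.

```python
def storage_overhead(k: int, m: int) -> float:
    """Raw bytes stored per logical byte for k data + m parity fragments."""
    return (k + m) / k


def site_failures_tolerated(k: int, m: int, sites: int) -> int:
    """Whole-site losses survivable when k + m fragments are spread evenly.

    Assumes an MDS code: any k surviving fragments rebuild the object,
    so at most m fragments may be lost.
    """
    base, extra = divmod(k + m, sites)
    # Worst case first: sites holding one extra fragment are assumed to fail first.
    loads = [base + 1] * extra + [base] * (sites - extra)
    lost = tolerated = 0
    for load in loads:
        if lost + load > m:
            break
        lost += load
        tolerated += 1
    return tolerated


if __name__ == "__main__":
    # (k, m, sites): two illustrative layouts, then 3-way replication as k=1, m=2.
    for k, m, sites in [(4, 2, 3), (10, 2, 6), (1, 2, 3)]:
        print(k, m, sites, storage_overhead(k, m), site_failures_tolerated(k, m, sites))
```

Under these assumptions, 4 + 2 fragments over three sites cost 1.5x and survive one site loss, 10 + 2 fragments over six sites cost 1.2x and also survive one site loss, and three-way replication (k = 1, m = 2) costs 3x and survives two.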

Misconceptions / Traps
  • Geo-dispersed erasure coding increases read latency. Reconstruction requires fetching fragments from multiple geographic sites, so each read pays the network round-trip time to the slowest site it depends on.
  • The failure domain is now geographic. If more sites are unreachable at once than the erasure code tolerates, data becomes temporarily unavailable, unlike multi-copy replication, where any single reachable copy suffices.
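
Both traps can be seen in a small model. The sketch below assumes fragments are fetched from all sites in parallel, that the reader needs any k fragments, and that latency is dominated by the round trip to the farthest site it has to wait for. The site names, round-trip times, and the `read_latency_ms` function are hypothetical.

```python
from typing import Dict, FrozenSet, Optional


def read_latency_ms(
    rtt_ms: Dict[str, float],
    fragments_at: Dict[str, int],
    k: int,
    unreachable: FrozenSet[str] = frozenset(),
) -> Optional[float]:
    """Round-trip time of the farthest site needed to gather k fragments.

    Returns None when the reachable sites hold fewer than k fragments,
    meaning the object is temporarily unavailable.
    """
    gathered = 0
    # Visit sites nearest-first; the last site needed sets the latency.
    for site in sorted(rtt_ms, key=rtt_ms.get):
        if site in unreachable:
            continue
        gathered += fragments_at.get(site, 0)
        if gathered >= k:
            return rtt_ms[site]
    return None


if __name__ == "__main__":
    rtt = {"local": 2.0, "near": 30.0, "far": 90.0}    # assumed RTTs in ms
    layout = {"local": 2, "near": 2, "far": 2}         # 4 + 2 code, 2 fragments per site
    print(read_latency_ms(rtt, layout, k=4))                                   # healthy
    print(read_latency_ms(rtt, layout, k=4, unreachable=frozenset({"local"})))  # one site down
    print(read_latency_ms(rtt, layout, k=4, unreachable=frozenset({"local", "near"})))
```

In this model a healthy read waits on the second-nearest site, losing the local site pushes every read out to the farthest site, and losing two sites makes the object unreadable until one returns.
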
Key Connections
  • solves Rebuild Window Risk — erasure coding across sites reduces single-site vulnerability
  • constrained_by Repair Bandwidth Saturation — cross-site repair consumes WAN bandwidth
  • scoped_to Object Storage, Geo / Edge Object Storage
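
The repair bandwidth constraint can be estimated with simple arithmetic. The sketch below assumes a classic MDS code such as Reed-Solomon, where rebuilding one lost fragment means reading k surviving fragments of the same size, and assumes all of those reads cross the WAN (as when a whole site is rebuilt elsewhere). The function names and the example figures are illustrative, not measurements.

```python
def repair_transfer_bytes(lost_bytes: int, k: int) -> int:
    """WAN bytes read to rebuild lost_bytes of fragments with an MDS code.

    Each lost fragment is recomputed from k surviving fragments of equal size.
    """
    return lost_bytes * k


def repair_hours(lost_bytes: int, k: int, wan_gbps: float) -> float:
    """Time to move the repair traffic over a link of wan_gbps gigabits per second."""
    bits = repair_transfer_bytes(lost_bytes, k) * 8
    return bits / (wan_gbps * 1e9) / 3600


if __name__ == "__main__":
    lost = 100 * 10**12  # assume 100 TB of fragments lost with a site
    print(repair_transfer_bytes(lost, k=4))      # bytes pulled across the WAN
    print(round(repair_hours(lost, k=4, wan_gbps=10.0), 1))  # hours on a dedicated 10 Gbps link
```

With these assumed numbers, losing 100 TB of fragments under a k = 4 code pulls 400 TB across the WAN, which takes roughly 89 hours on a fully dedicated 10 Gbps link. Codes designed for cheaper repair reduce this figure, so treat it as an upper-end estimate for the plain MDS case.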

Definition

What it is

An erasure coding scheme that distributes data fragments across multiple geographic sites, so that any sufficiently large subset of sites can reconstruct the full object; the required subset size is set by the code parameters. It provides both durability and data locality across regions.
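
To make "reconstruct from a subset of sites" concrete, here is a minimal sketch using the simplest possible code: two data fragments plus one XOR parity fragment, one per site. This is a toy stand-in chosen for brevity, not the scheme any particular product uses (production systems typically use Reed-Solomon or similar codes), and the site names are hypothetical.

```python
from typing import Dict


def _xor(p: bytes, q: bytes) -> bytes:
    """Byte-wise XOR of two equal-length fragments."""
    return bytes(x ^ y for x, y in zip(p, q))


def encode(obj: bytes) -> Dict[str, bytes]:
    """Split obj into two data fragments and one parity fragment, one per site."""
    half = (len(obj) + 1) // 2
    first = obj[:half]
    second = obj[half:].ljust(half, b"\0")  # pad so both halves have equal length
    return {"site-a": first, "site-b": second, "site-c": _xor(first, second)}


def reconstruct(available: Dict[str, bytes], length: int) -> bytes:
    """Rebuild the object from the fragments of any two of the three sites."""
    if len(available) < 2:
        raise ValueError("need fragments from at least 2 of the 3 sites")
    first = available.get("site-a")
    second = available.get("site-b")
    parity = available.get("site-c")
    if first is None:
        first = _xor(second, parity)   # recover the missing data fragment
    elif second is None:
        second = _xor(first, parity)
    return (first + second)[:length]   # strip the padding


if __name__ == "__main__":
    obj = b"geo-dispersed object"
    fragments = encode(obj)
    for down in fragments:  # simulate each site being unreachable in turn
        survivors = {s: f for s, f in fragments.items() if s != down}
        assert reconstruct(survivors, len(obj)) == obj
    print("object recovered with any single site down")
```

This layout stores 1.5x the object size and tolerates the loss of any one site; tolerating more failures requires a code with more parity fragments.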

Why it exists

Traditional replication (3 copies across 3 availability zones) is expensive because it stores three times the logical data. Geo-dispersed erasure coding achieves equivalent or better durability at lower storage overhead (typically 1.2-1.5x vs. 3x for replication) while keeping data fragments close to multiple compute locations.

Primary use cases

Multi-region durable storage with low overhead, cross-site data availability, disaster-resilient object storage.
