The Whole Stack Went Open — Weights, Storage, and Sovereignty

This index has tracked Vendor Lock-In since its first commit. It's the pain point that explains more architecture decisions than any benchmark: why teams self-host MinIO instead of renting S3, why OpenDAL exists, why Egress Cost and Data Residency keep showing up in procurement reviews. Lock-in is the disease. Openness is the cure.

What changed in the second quarter of 2026 is that the cure stopped being a storage story and became a whole-stack story (open weights, open storage, open sovereignty), and the teams shipping it fastest are mostly not American. That's not a slogan. It's a pattern you can read straight off this quarter's releases, and there's now a research paper arguing why it's happening.

Signal 1 — the model went open, and "good enough" stopped needing the frontier

DeepSeek V4 preview (April 24, 2026) scaled the DeepSeekMoE sparse architecture to 1.6 trillion parameters (49B active) in V4-Pro, plus a 284B V4-Flash. Both ship with open weights, making V4-Pro the largest open-weights model anyone can download.[1] V4-Pro posts 80.6% on SWE-bench Verified, 93.5% on LiveCodeBench, a Codeforces rating of 3206, and a 1-million-token context window,[2] which puts open weights squarely on top of last year's closed frontier.

The closed frontier did not wait to be caught. Six weeks later Anthropic shipped Claude Fable 5 (June 9, 2026), state-of-the-art on nearly every benchmark and compressing a 50-million-line Ruby migration from two months to a single day, at $10 input / $50 output per million tokens.[3] So the gap is a treadmill, not a closing door: open weights match the frontier of twelve months ago while the frontier sprints ahead again. The honest read isn't "open caught up." It's that the two are pulling apart, and for the overwhelming majority of object-storage-adjacent work, the frontier was never the point. A model you can run behind your own Data Residency boundary, one generation back, at a fraction of the cost, is not a compromise; it's a procurement win. (We follow that split all the way down in The Frontier and the Floor.)

Signal 2 — the storage layer went open too

The model is the headline; the storage layer is the part this index cares about, and it moved in the same direction. DeepSeek 3FS, the Fire-Flyer File System that feeds DeepSeek's training and inference, is open source: an NVMe + RDMA distributed file system with a published aggregate read throughput of 6.6 TB/s on a 180-node cluster, and a KV-cache mode that sidesteps DRAM cost for inference serving.[4] The same vertical bet that produced an open model produced an open storage substrate underneath it. You can adopt the data plane without adopting the vendor.
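To put the 3FS headline number in per-machine terms, here is a quick back-of-the-envelope calculation. It assumes decimal units (1 TB = 1000 GB) and an even spread of reads across all 180 nodes; neither assumption comes from the 3FS documentation, so treat the result as a rough average rather than a measured per-node figure.

```python
# Figures quoted in the text: 6.6 TB/s aggregate read on a 180-node cluster.
AGGREGATE_TB_PER_S = 6.6
NODES = 180

# Assumption: decimal units, so 1 TB = 1000 GB.
GB_PER_TB = 1000


def per_node_gb_per_s(aggregate_tb_per_s: float, nodes: int) -> float:
    """Average read throughput per node in GB/s, assuming an even spread."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return aggregate_tb_per_s * GB_PER_TB / nodes


if __name__ == "__main__":
    rate = per_node_gb_per_s(AGGREGATE_TB_PER_S, NODES)
    print(f"{rate:.1f} GB/s per node")  # prints "36.7 GB/s per node"
```

Under those assumptions each node averages roughly 36.7 GB/s of reads, which is why the design leans on NVMe and RDMA rather than conventional disks and TCP.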

And the commodity end of object storage kept going open and going sovereign:

  • Europe stood up sovereign S3. Cubbit and Worldstream launched independent, sovereign S3 storage aimed squarely at organizations that need data to stay on home soil and out of reach of the CLOUD Act: geo-distributed, EU-operated, S3-compatible.[5]
  • The post-MinIO self-hosted stack matured. After MinIO narrowed its open-source posture, the gap was filled by genuinely open alternatives (versitygw, SeaweedFS, Garage, and RustFS) that now form a credible self-hosted S3 tier for teams that refuse to depend on a single vendor's roadmap.[6]
  • Independent providers competed on zero lock-in, not on a walled garden: Wasabi and Hetzner keep winning on flat, egress-free pricing — the direct economic answer to Egress Cost.
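What "S3-compatible" buys you in practice is that the backend becomes a configuration value rather than a code dependency. The sketch below is one hypothetical way to model that: the `S3Target` class, the target names, the hostnames, and the region strings are all illustrative placeholders, not real provider endpoints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class S3Target:
    """One S3-compatible backend. All values used below are placeholders."""

    name: str
    endpoint_url: str
    region: str

    def client_kwargs(self) -> dict:
        """Keyword arguments in the shape an S3 SDK client usually accepts."""
        return {"endpoint_url": self.endpoint_url, "region_name": self.region}


# Hypothetical targets: the first hostname uses the reserved .example TLD,
# the second assumes a self-hosted gateway listening on localhost.
TARGETS = {
    "sovereign-eu": S3Target("sovereign-eu", "https://s3.provider.example", "eu-west-1"),
    "self-hosted": S3Target("self-hosted", "http://localhost:9000", "us-east-1"),
}


def select_target(name: str) -> S3Target:
    """Look up a backend by name so application code never names a vendor."""
    try:
        return TARGETS[name]
    except KeyError:
        raise ValueError(f"unknown S3 target: {name!r}") from None
```

With boto3 installed, the result would be used as `boto3.client("s3", **select_target("self-hosted").client_kwargs())`, with credentials supplied separately. Compatibility is not uniform across implementations, so a real migration should still test the specific S3 features a workload relies on.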

Signal 3 — the closed strategy may have caused the open surge

Here's the part that ties it together, and it's not opinion. It's the thesis of a 2026 paper: U.S. export controls and a closed-model strategy unintentionally accelerated China's open AI ecosystem.[7] The mechanism is clean. Restricting the export of frontier compute and weights raised the strategic value of open, locally-adaptable systems, and Chinese developers responded by engaging open-source LLM repositories substantially more than U.S. developers did, while China embedded open-source AI into national technology strategy through standards coordination and resilience-oriented deployment.[7]

The controls landed asymmetrically in another way too. You can't meaningfully export-control an open-weights model: it's a torrent, not a shipment.[7] So the policy that was meant to preserve an American lead instead handed the open ecosystem its strongest selling point: you cannot be cut off from something nobody controls. Meanwhile, analysts at Brookings and Chatham House warn that broad controls risk impairing AI research at U.S. universities, kneecapping exactly the open, publishable research that built the lead in the first place.[8][9]

There's a domestic mirror to this. The American storage-and-data headlines of the quarter were about consolidation and capital, not openness: ClickHouse crossed $250M ARR and bought its way into LLM observability; Databricks absorbed Tabular and keeps the lakehouse governance story inside Unity Catalog. Great businesses — but the gravity is toward moats. The open-stack gravity is elsewhere.

What this means if you're building on S3

  1. Openness is now a full-stack property, not a storage checkbox. You can run an open model on an open file system behind a sovereign object store and owe no single vendor a roadmap dependency. Two years ago that sentence had a hole in it at the model layer. DeepSeek V4 closed it.
  2. Sovereignty is the new lock-in axis. Data Residency, CLOUD Act Data Access, and China Data Localization are no longer compliance footnotes — they're the reason the fastest-growing object-storage offerings are regional and open, not global and proprietary.
  3. Don't confuse "the frontier moved" with "the floor matters less." The closed frontier is still first — Fable 5 made sure of that. But the frontier and the floor are now diverging: the hardest autonomous work goes to the closed, expensive top; the high-volume commodity inference that actually runs your pipelines, validators, and catalog tooling runs open, ~50× cheaper, and on your own soil. That's two tiers and two separate procurement decisions, not one race.
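The two-tier split in point 3 is easy to make concrete with arithmetic. The sketch below takes the frontier price from the text ($10 input / $50 output per million tokens) and reads the "~50× cheaper" figure literally as a per-token price ratio. That reading is a simplifying assumption, since self-hosted inference is really paid for in hardware and operations rather than per token. The workload volumes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_m: float   # USD per million input tokens
    output_per_m: float  # USD per million output tokens

    def cost(self, input_m: float, output_m: float) -> float:
        """Cost in USD for the given token volumes (in millions)."""
        return input_m * self.input_per_m + output_m * self.output_per_m


# Frontier price as quoted in the text.
FRONTIER = Price(input_per_m=10.0, output_per_m=50.0)

# Assumption: the text's "~50x cheaper" applied as a flat per-token ratio.
CHEAPER_FACTOR = 50
OPEN = Price(FRONTIER.input_per_m / CHEAPER_FACTOR,
             FRONTIER.output_per_m / CHEAPER_FACTOR)


def blended_cost(input_m: float, output_m: float, frontier_share: float) -> float:
    """Monthly cost when `frontier_share` of traffic goes to the closed tier."""
    if not 0.0 <= frontier_share <= 1.0:
        raise ValueError("frontier_share must be between 0 and 1")
    open_share = 1.0 - frontier_share
    return (FRONTIER.cost(input_m * frontier_share, output_m * frontier_share)
            + OPEN.cost(input_m * open_share, output_m * open_share))


if __name__ == "__main__":
    # Hypothetical workload: 1,000M input and 200M output tokens per month.
    for share in (1.0, 0.1, 0.0):
        print(f"frontier share {share:.0%}: ${blended_cost(1000, 200, share):,.0f}")
```

For that hypothetical workload the bill is about $20,000 with everything on the frontier tier, about $2,360 with 10% routed there, and about $400 with everything on the open tier. The point is the shape of the curve, not the exact numbers: the routing share dominates the total.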

In 2024 the open argument was about cost and control at the storage layer. In 2026 it runs the whole height of the stack — weights, file system, and jurisdiction — and the teams executing it hardest are the ones that the closed strategy was supposed to hold back.


Footnotes

  1. DeepSeek V4 preview — 1.6T-parameter V4-Pro + 284B V4-Flash, open weights, frontier-adjacent capability at a fraction of access cost — DeepSeek V4 (DataCamp).

  2. DeepSeek V4-Pro benchmarks — 80.6% SWE-bench Verified, 93.5% LiveCodeBench, Codeforces 3206, 1M-token context — DeepSeek V4-Pro Complete Guide.

  3. Claude Fable 5 + Mythos 5 (June 9, 2026) — state-of-the-art across benchmarks, $10/M input + $50/M output, claude-fable-5; Stripe's 50M-line Ruby migration compressed from two months to one day — Anthropic — Claude Fable 5 and Mythos 5.

  4. DeepSeek 3FS (Fire-Flyer File System) — open-source NVMe + RDMA distributed FS, 6.6 TB/s aggregate read on a 180-node cluster, inference KV-cache mode — deepseek-ai/3fs (GitHub).

  5. Cubbit + Worldstream launch independent, sovereign S3 cloud storage — Worldstream press.

  6. The post-MinIO open self-hosted S3 landscape — versitygw, SeaweedFS, Garage, RustFS, Ceph — Self-Hosted S3 Storage in 2026: RustFS, SeaweedFS, Garage, or Ceph?, and Self-hosted S3 after MinIO: lightweight alternatives for 2026.

  7. U.S. policies unintentionally accelerated China's open AI ecosystems — export controls raised the strategic value of open, locally-adaptable systems; Chinese developers engaged open repositories substantially more than U.S. developers; open weights are not meaningfully export-controllable — arXiv:2606.15999.

  8. Broad AI export controls risk impairing research at U.S. universities — Brookings — The tension between AI export control and U.S. AI innovation.

  9. AI export controls as a blunt instrument — Chatham House — AI export controls are not the best bargaining chip.