For most of the lakehouse era the catalog was an afterthought: a place to remember where the tables live. Hive Metastore was a Thrift service over a relational backend. Delta Lake kept its state in a transaction log stored alongside the data files. Iceberg scattered metadata files across object storage. In every case the catalog was a sidecar to the table format: necessary, but architecturally subordinate, and assumed to be fast enough.
In May 2026 three signals converged to retire that assumption. The catalog is becoming a database: not a config file, not a metadata appendix, but a first-class system with its own query language, concurrency control, storage format, and API. This is the companion shift to the cost inversion described in "When Inference Became Cheaper Than Storage": when storage gets distributed and inference runs continuously, the catalog becomes the coordination layer, and it has to be as fast as the storage it coordinates.
The pain point underneath this entire wave is Metadata Overhead — the failure mode where the metadata layer, not the data layer, is what's limiting your throughput.
Why the existing catalogs ran out of road
The TreeCat research paper out of the University of Maryland (PVLDB 2025) opens with a claim that sounds obvious once stated and that nobody had built for:
"In this context, the catalog itself can be seen as an independent database (sub-)system specialized for handling metadata operations requested by various database engines. As such, the standalone catalog has its own set of functional and systemic requirements, which are not adequately addressed by existing solutions."1
Hold each incumbent against that standard and the gap is structural, not incidental:
| Catalog | Model | Where it breaks |
|---|---|---|
| Hive Metastore | Thrift service over a relational DB | single point of contention, coarse locking, never designed for high-concurrency metadata |
| Delta Lake | transaction log (JSON commits with Parquet checkpoints) | tied to the table format, commit latency grows with table size |
| Apache Iceberg | metadata files in object storage | object-store latency caps commit throughput; no fine-grained concurrency |
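The Iceberg row's claim that object-store latency caps commit throughput can be made concrete with back-of-envelope arithmetic. The sketch below assumes commits to one table are serialized and that each commit needs a fixed number of sequential metadata round trips; the round-trip count and both latency figures are illustrative assumptions, not measurements from any of the systems discussed.

```python
def max_commits_per_second(round_trips: int, latency_us: int) -> float:
    """Upper bound on serialized commits per second for a single table.

    Each commit waits for `round_trips` sequential metadata operations,
    each taking `latency_us` microseconds, before the next can start.
    """
    return 1_000_000 / (round_trips * latency_us)


# Assumed figures for illustration only.
ROUND_TRIPS = 4               # hypothetical sequential metadata operations per commit
OBJECT_STORE_US = 50_000      # assumed 50 ms per object-store round trip
LOCAL_STORE_US = 100          # assumed 0.1 ms per local key-value operation

object_store_cap = max_commits_per_second(ROUND_TRIPS, OBJECT_STORE_US)
local_cap = max_commits_per_second(ROUND_TRIPS, LOCAL_STORE_US)

print(f"object store: {object_store_cap:.0f} commits/s")
print(f"local store:  {local_cap:.0f} commits/s")
```

Under these assumptions the ceiling is set by latency alone: adding writer threads does not raise it, because the metadata path is serialized.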
The deeper insight is about shape. Metadata is inherently hierarchical — databases contain tables, tables contain columns, columns carry statistics, partitions contain files, files have metrics. That's a tree. The incumbents flatten it into relational rows (HMS) or manifest files (Iceberg), then pay to reconstruct the hierarchy on every read. TreeCat's bet is to stop flattening: adopt a hierarchical data model with a path-navigation query language, "similar to XPath," that maps directly onto how metadata is actually structured.1
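To see what "stop flattening" buys, here is a minimal sketch of metadata held as a tree and queried by path. It is not TreeCat's query language or data model: the `navigate` function, the `*` wildcard, and the sample catalog contents are all hypothetical, chosen only to show hierarchy-preserving lookups.

```python
from typing import Any, Iterator

# Hypothetical catalog contents: database -> table -> columns/partitions -> files.
CATALOG: dict[str, Any] = {
    "sales": {
        "orders": {
            "columns": {
                "order_id": {"type": "bigint", "stats": {"min": 1, "max": 9000}},
                "region": {"type": "string", "stats": {"distinct": 4}},
            },
            "partitions": {
                "dt=2026-05-01": {"files": {"part-0.parquet": {"rows": 1200}}},
                "dt=2026-05-02": {
                    "files": {
                        "part-0.parquet": {"rows": 800},
                        "part-1.parquet": {"rows": 400},
                    }
                },
            },
        }
    }
}


def navigate(node: Any, path: str) -> Iterator[tuple[str, Any]]:
    """Yield (resolved_path, value) for every node matching a slash-separated path.

    A `*` step matches every child of the current node; any other step
    matches only the child with exactly that name.
    """
    steps = [s for s in path.split("/") if s]
    frontier = [("", node)]
    for step in steps:
        next_frontier = []
        for prefix, current in frontier:
            if not isinstance(current, dict):
                continue  # leaf values have no children to descend into
            names = list(current.keys()) if step == "*" else [step]
            for name in names:
                if name in current:
                    next_frontier.append((f"{prefix}/{name}", current[name]))
        frontier = next_frontier
    yield from frontier


# Every file of every partition of sales.orders, found with one path expression.
matches = list(navigate(CATALOG, "/sales/orders/partitions/*/files/*"))
total_rows = sum(meta["rows"] for _, meta in matches)
print(len(matches), total_rows)
```

The same question against a flattened relational layout would need joins across partition and file tables; here the path itself carries the hierarchy.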
Signal 1 — TreeCat: the catalog as a standalone engine
TreeCat is a single-server C++ database engine — roughly 12,000 lines — built specifically to serve metadata. It has the parts you'd expect of a real database, not a config store: a gRPC backend, a query executor with batch correlated scans, a transaction manager running multi-version optimistic concurrency control (MVOCC), and a RocksDB-backed storage layer using a BSON data format.2
The design choices all point the same direction — treat metadata operations as a first-class workload:
- Write-optimized layout (RocksDB LSM-tree) for high-ingest metadata churn.
- Versioned storage (MVCC) for time travel, cloning, and snapshots — the same primitives the table formats expose, but at the catalog layer.
- MVOCC concurrency: serializable isolation without coarse locks, the first catalog-specific concurrency-control mechanism rather than a generic protocol retrofitted onto metadata. It combines scan-range and precision locking to manage predicate dependencies, and defers updates to commit time so that frequently-updated fields like statistics don't trigger aborts.3
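The versioning and optimistic-concurrency ideas in the list above can be sketched in a few lines. This is a deliberately simplified stand-in, not TreeCat's MVOCC protocol: it uses plain backward validation of a per-key read set, ignores predicate (range) reads entirely, and is single-threaded. The class names and keys are hypothetical.

```python
class Conflict(Exception):
    """Raised when a key the transaction read was overwritten before it committed."""


class VersionedCatalog:
    def __init__(self) -> None:
        self.commit_ts = 0
        # key -> list of (commit_ts, value), oldest first
        self.versions: dict[str, list[tuple[int, object]]] = {}

    def read(self, key: str, as_of: int):
        """Return the newest value committed at or before `as_of` (time travel)."""
        result = None
        for ts, value in self.versions.get(key, []):
            if ts <= as_of:
                result = value
        return result

    def latest_ts(self, key: str) -> int:
        history = self.versions.get(key, [])
        return history[-1][0] if history else 0

    def begin(self) -> "Txn":
        return Txn(self, self.commit_ts)


class Txn:
    def __init__(self, catalog: VersionedCatalog, start_ts: int) -> None:
        self.catalog = catalog
        self.start_ts = start_ts
        self.read_set: set[str] = set()
        self.write_buffer: dict[str, object] = {}

    def get(self, key: str):
        if key in self.write_buffer:
            return self.write_buffer[key]
        self.read_set.add(key)
        return self.catalog.read(key, self.start_ts)

    def put(self, key: str, value: object) -> None:
        # Buffered locally; nothing is visible to other transactions until commit.
        self.write_buffer[key] = value

    def commit(self) -> int:
        # Validation: abort if any key we read has a version newer than our snapshot.
        for key in self.read_set:
            if self.catalog.latest_ts(key) > self.start_ts:
                raise Conflict(key)
        self.catalog.commit_ts += 1
        ts = self.catalog.commit_ts
        for key, value in self.write_buffer.items():
            self.catalog.versions.setdefault(key, []).append((ts, value))
        return ts


SCHEMA = "/sales/orders/schema"
ROWS = "/sales/orders/stats/row_count"

cat = VersionedCatalog()
setup = cat.begin()
setup.put(SCHEMA, ["order_id"])
setup.put(ROWS, 0)
v1 = setup.commit()

a, b = cat.begin(), cat.begin()
a.put(SCHEMA, a.get(SCHEMA) + ["region"])  # read-modify-write on the schema
b.put(ROWS, 2400)                           # blind statistics write, no read
b_ts = b.commit()
a_ts = a.commit()                           # succeeds: nobody touched the schema

c, d = cat.begin(), cat.begin()
c.put(SCHEMA, c.get(SCHEMA) + ["dt"])
d.put(SCHEMA, d.get(SCHEMA) + ["price"])
c.commit()
try:
    d.commit()
    aborted = False
except Conflict:
    aborted = True                          # d read a schema version that c replaced

print(aborted, cat.read(SCHEMA, as_of=v1), cat.read(SCHEMA, as_of=cat.commit_ts))
```

Two things carry over loosely to the design described above: the statistics write never enters a read set, so it cannot cause an abort, and old versions remain readable by timestamp, which is the primitive behind time travel and snapshots.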
Benchmarked against HMS, Delta Lake, and Iceberg on TPC-DS at 100TB scale, the results land where the architecture predicts: under a 30-thread write-intensive workload TreeCat sustains the highest throughput at the lowest abort rate, while lock-based schemes hit 31–64% abort rates or stall on contention. On schema evolution, measured with alterTable operations, TreeCat posts the lowest latency of the four and Iceberg the highest.4 A catalog built like a database beats catalogs built like file conventions, at the one job catalogs exist to do.
Signal 2 — StarTree: the query engine that needs a fast catalog
StarTree Cloud, built on Apache Pinot, became the first system to offer low-latency serving directly on Iceberg tables — no explicit ingestion, no conversion to a proprietary segment format, no shadow copy.5 This is the "reverse ETL is dead" thesis in production: one physical layout, no duplication between lakehouse and serving tier.
How it hits sub-second latency is the interesting part, because every step depends on the catalog:
- Pinot reads Iceberg partition metadata and column statistics to prune segments before any remote read.
- Bloom filters eliminate irrelevant segments.
- An index-first architecture maps predicates to Parquet pages, not whole files.
- A custom Parquet reader prefetches and decompresses in parallel.
- Hierarchical caching separates the data cache from the index cache.
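The first two steps in the list above, statistics-based pruning followed by Bloom-filter elimination, can be sketched as a pure metadata operation. This is an illustration of the general technique, not StarTree's or Pinot's implementation; `FileMeta`, `make_file`, `prune`, and the toy `BloomFilter` are all hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits: int = 256, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, item: str):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


@dataclass
class FileMeta:
    path: str
    min_ts: int  # column statistics kept in the catalog
    max_ts: int
    user_ids: BloomFilter = field(default_factory=BloomFilter)


def make_file(path: str, min_ts: int, max_ts: int, users: list[str]) -> FileMeta:
    meta = FileMeta(path, min_ts, max_ts)
    for user in users:
        meta.user_ids.add(user)
    return meta


def prune(files: list[FileMeta], ts_lo: int, ts_hi: int, user_id: str) -> list[str]:
    """Return paths that may hold rows with ts in [ts_lo, ts_hi] for user_id."""
    survivors = []
    for f in files:
        if f.max_ts < ts_lo or f.min_ts > ts_hi:
            continue  # min/max statistics rule the file out
        if not f.user_ids.might_contain(user_id):
            continue  # Bloom filter rules the file out
        survivors.append(f.path)
    return survivors


files = [
    make_file("dt=2026-05-01/part-0.parquet", 0, 99, ["alice", "bob"]),
    make_file("dt=2026-05-02/part-0.parquet", 200, 299, ["alice", "carol"]),
    make_file("dt=2026-05-02/part-1.parquet", 200, 299, ["dave", "erin"]),
]
to_read = prune(files, ts_lo=150, ts_hi=250, user_id="carol")
print(to_read)
```

The first file is always dropped by its timestamp range and the second always survives. The third is dropped unless the Bloom filter returns a false positive, in which case it is read unnecessarily but the query result stays correct. No data file is opened at any point; everything above runs against metadata.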
The headline result is sub-second P99 at 500 QPS on representative lakehouse workloads, benchmarked against production traffic at Stripe, DoorDash, and Cisco.5 But notice the first and third steps in the list above: the pruning that lets Pinot skip object-store I/O altogether happens in the catalog. If the catalog is slow (HMS with Thrift overhead, or Iceberg with object-store latency on the metadata path), the query engine cannot reach its sub-second promise no matter how good its reader is. The catalog is the coordination bottleneck that decides whether the query engine performs. Which is precisely the problem TreeCat was built to remove.
Signal 3 — Weaviate MCP: agents as catalog consumers
Weaviate v1.37.0 (April 16, 2026) shipped a built-in MCP server at /v1/mcp — the first major vector database to integrate the Model Context Protocol natively.6 An agent can now introspect schema, query vectors with natural-language tool descriptions, and create or delete collections autonomously, with no human-written integration code.
That looks like a vector-DB feature, but it's a catalog story. When agents touch a vector database, they aren't only reading vectors — they're discovering schema at runtime, reading metadata about embedding models and distance metrics, writing metadata as they create collections, and coordinating with other agents over shared structures. That is the catalog workload — metadata operations from many heterogeneous clients — except the clients are now autonomous systems issuing requests at machine speed, not analysts writing SQL. The catalog has to serve them with sub-second latency, which is the same bar StarTree's query path demands and the same bar TreeCat's engine was built to clear.
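For a sense of what such a request looks like on the wire, MCP messages are JSON-RPC 2.0, with `tools/list` for discovery and `tools/call` for invocation. The sketch below only builds the payloads and sends nothing. The host and port, the tool name `describe_collection`, and its arguments are assumptions for illustration; the tools Weaviate actually exposes should be taken from its `tools/list` response, and a real session begins with an `initialize` handshake that is omitted here.

```python
import itertools
import json
from typing import Any, Optional

# Assumed host and port; only the /v1/mcp path comes from the text above.
MCP_URL = "http://localhost:8080/v1/mcp"

_request_ids = itertools.count(1)


def jsonrpc_request(method: str, params: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 request envelope as used by MCP."""
    message: dict[str, Any] = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
    }
    if params is not None:
        message["params"] = params
    return message


# Step 1: the agent discovers which tools the server offers.
list_tools = jsonrpc_request("tools/list")

# Step 2: the agent calls one of them. The tool name and arguments here are
# hypothetical placeholders, not taken from Weaviate's documentation.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "describe_collection", "arguments": {"collection": "Articles"}},
)

for message in (list_tools, call_tool):
    print("POST", MCP_URL, json.dumps(message))
```

Both messages are metadata operations: one asks what exists, the other asks how it is shaped. That is the catalog workload arriving over a new protocol.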
The convergence
Three patterns, one direction:
- Disaggregation. The 2024 model fused the catalog into the table format or hid it behind a simple service. The 2026 model breaks it out as an independent engine — TreeCat explicitly argues for "disaggregating the catalog functionality for higher composability."
- The catalog is the query engine's bottleneck. Sub-second serving (StarTree) is achievable only if metadata pruning happens before any data read, and that is possible only if the catalog is fast.
- Agents need catalog access, not just data access. MCP-native infrastructure (Weaviate) means the catalog now serves autonomous clients at runtime, not batch ETL on a schedule.
What this means for object storage
If the catalog is a database, it needs storage characteristics the data lake doesn't provide. The lakehouse splits cleanly into layers with different demands:
- Data layer — S3 / object storage, optimized for cheap, durable, high-throughput bulk reads.
- Catalog layer — local, low-latency, high-IOPS, strongly consistent. TreeCat uses RocksDB on local storage precisely because a catalog serving sub-second queries cannot live on S3. Object-store latency is fine for bulk data and fatal for the metadata coordination path.
- Query layer — Pinot/StarTree (or a native lakehouse engine) coordinating between the two.
- Agent layer — MCP-native vector and metadata services consumed at machine speed.
Two reframes fall out of this directly. First, the small files problem is really a catalog problem, not a storage problem. Billions of small objects only create a bottleneck when a centralized, coarsely-locked catalog (HMS) has to track them; a distributed, hierarchical catalog absorbs the file count without the same contention. When Tigris advertises "billions of small files," what makes that possible is the catalog layer being architected for it. Second, S3 becomes the archive, not the coordinator — it serves bytes, the catalog serves metadata, and conflating the two is what made the old stack slow.
Metadata's independence day
In 2024 the catalog was a sidecar — a slow relational service or a pile of files next to the data. In 2026 three independent signals say otherwise: TreeCat proves catalog-specific engines outperform generic ones; StarTree proves query engines need fast catalogs to keep their latency promises; Weaviate's MCP server proves agents need catalog access, not just data access.
The catalog is no longer part of the table format. It is an independent database layer, with its own query language, concurrency model, storage format, and API, serving query engines, agents, and pipelines alike, all at sub-second latency. The lakehouse is no longer "data in S3 plus a table format on top." It is four layers, and the catalog layer just became the most interesting database in the stack.
It's about time.
Footnotes
1. TreeCat: a standalone catalog engine. arXiv:2503.02956, Section 1; source at github.com/umddb/treecat.
2. TreeCat architecture and storage layout. arXiv:2503.02956, Sections 4–5.
3. The MVOCC protocol. arXiv:2503.02956, Section 4.4.
4. Experimental evaluation vs HMS, Delta Lake, Iceberg (TPC-DS 100TB). arXiv:2503.02956, Section 7.
5. StarTree low-latency serving on Iceberg with Apache Pinot. startree.ai; see also Apache Pinot in 2026.
6. Weaviate v1.37.0 MCP server. Weaviate MCP docs; Vector Database News, April 2026.