Technology

Milvus

Summary

What it is

A distributed vector database built for billion-scale similarity search, using a microservices architecture with SSD caching for hot data and native S3 cold storage offload.

Where it fits

Milvus targets organizations that need to search billions of vectors. Its S3 integration for cold-data offload and its log-based write-ahead design suit workloads that exceed what a single-node deployment of a simpler vector database (such as Qdrant or Weaviate) can handle, at the cost of significantly higher operational complexity.

Misconceptions / Traps

Milvus is a distributed system with significant operational complexity. A cluster deployment depends on etcd, MinIO or S3, and Pulsar or Kafka; it is not a single-binary deployment.
S3 is used for persistent storage and log backup, not as a live query tier. Query performance depends on in-memory and SSD-cached segments, not S3 latency.
The microservices architecture enables scaling but introduces failure modes absent in simpler vector databases. Expect to invest in monitoring and operations.
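
The dependency list above can be turned into a simple pre-flight check. The following is a minimal sketch, not an official Milvus tool: it only probes TCP reachability of each backing service. The host names are hypothetical placeholders, and the ports are commonly used defaults (etcd 2379, MinIO 9000, Pulsar 6650, Milvus 19530) that are assumed here and should be adjusted to the actual deployment.

```python
import socket

# Hypothetical endpoints: host names are placeholders, ports are assumed defaults.
DEPENDENCIES = {
    "etcd": ("etcd.internal", 2379),
    "object-storage (MinIO/S3)": ("minio.internal", 9000),
    "message-queue (Pulsar)": ("pulsar.internal", 6650),
    "milvus-proxy": ("milvus.internal", 19530),
}

def is_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False

def check_dependencies(deps, timeout=1.0):
    """Map each dependency name to a reachability flag."""
    return {
        name: is_reachable(host, port, timeout)
        for name, (host, port) in deps.items()
    }
```

Calling check_dependencies(DEPENDENCIES) returns a dict of service name to boolean. An open port only shows that something is listening; it does not prove the service is healthy, so a check like this complements real monitoring rather than replacing it.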

Key Connections

depends_on S3 — uses S3 for persistent object storage of segments and logs
scoped_to Vector Indexing on Object Storage — billion-scale vector search over S3 data
solves Cold Scan Latency — hot vector caching with durable S3 persistence

Definition

What it is

A distributed vector database designed for billion-scale workloads. Uses a microservices architecture with SSD-backed caching for hot data and native offloading of cold vectors to S3-compatible object storage.
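
The hot/cold split described above can be illustrated with a toy model. This is a sketch under stated assumptions, not Milvus's actual segment or cache design: the class name TieredVectorStore is hypothetical, a plain dict stands in for S3-compatible object storage, and a bounded LRU map stands in for the memory or SSD cache.

```python
from collections import OrderedDict

class TieredVectorStore:
    """Toy two-tier store: a bounded LRU 'hot' tier in front of a 'cold' tier.

    The cold tier is a dict standing in for object storage (an assumption
    made for illustration only).
    """

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()  # segment_id -> vectors, most recently used last
        self.cold = {}            # segment_id -> vectors (durable copy)
        self.cold_reads = 0       # number of slow-path fetches

    def put(self, segment_id, vectors):
        # Segments are persisted to the cold tier; they enter the hot tier on first read.
        self.cold[segment_id] = vectors

    def get(self, segment_id):
        if segment_id in self.hot:
            self.hot.move_to_end(segment_id)  # refresh recency on a cache hit
            return self.hot[segment_id]
        vectors = self.cold[segment_id]       # slow path: fetch from the cold tier
        self.cold_reads += 1
        self.hot[segment_id] = vectors
        if len(self.hot) > self.hot_capacity:
            self.hot.popitem(last=False)      # evict the least recently used segment
        return vectors
```

Repeated reads of the same segment are served from the hot tier, so cold_reads grows only when a segment is first touched or has been evicted. That is the property that keeps query latency decoupled from object-storage latency for a warm working set.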

Why it exists

Single-node vector databases hit memory and throughput ceilings at billion-vector scale. Milvus distributes the index across a cluster, uses tiered storage to keep hot vectors in memory while parking cold vectors on S3, and scales compute independently from storage.
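
A back-of-the-envelope calculation shows why a single node runs out of memory at this scale. The sketch below assumes float32 values (4 bytes each), 1536-dimensional embeddings, and a hypothetical 256 GiB of usable memory per node; it counts raw vector data only and ignores index and metadata overhead.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw storage for dense vectors, excluding index and metadata overhead."""
    return num_vectors * dim * bytes_per_value

def nodes_needed(total_bytes, usable_bytes_per_node):
    """Ceiling division: minimum number of nodes to hold total_bytes in memory."""
    return -(-total_bytes // usable_bytes_per_node)

GIB = 1024 ** 3

# One billion float32 vectors at 1536 dimensions (assumed values).
total = raw_vector_bytes(1_000_000_000, 1536)
print(total / 1e12)                    # 6.144 (decimal terabytes)
print(nodes_needed(total, 256 * GIB))  # 23 (hypothetical 256 GiB usable per node)
```

Roughly 6 TB of raw vectors is well beyond the memory of a typical single server, which is the motivation for sharding the index across a cluster and keeping only the hot portion in memory.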

Primary use cases

Billion-scale embedding search for enterprise RAG systems, tiered vector storage with S3 cold tier, distributed similarity search across massive embedding collections.

Recent developments

Latest signals

Latest release: v2.6.18 (current as of June 2026), per the milvus-io/milvus releases page. It is generally available on the 2.6 stable line (the 3.0.0 line remains in beta) and adds nullable vector support and element-level search on Struct fields. This entry tracks the upstream stable release line.
Milvus v2.6.15 (April 24, 2026): focus on stability and recovery time. Per the milvus-io/milvus releases page, v2.6.15 ships faster MixCoord recovery, optimized search and query filter performance, and 20+ bug fixes. The Go runtime was upgraded to 1.25.8 as part of a CVE-driven base-image refresh. The pattern of frequent point releases focused on operational concerns signals that Milvus is now in "production-hardening" mode rather than "feature land-grab" mode.
Benchmark positioning: 6 ms p50 latency on 1M-vector workloads. Per the SaltTechno vector DB benchmark (February 2026; 1M vectors, 1536 dimensions), Milvus posts 6 ms p50 and 35 ms p99 latency, with 8 index algorithms available (including GPU). The HolySheep showdown against Qdrant and Weaviate (April 2026) measures Milvus 2.4 at 28 ms p50 / 145 ms p99 cold and 9 ms p50 / 31 ms p99 warm, with a bulk insert of 1M vectors in 4m 12s and a 99.82% success rate under a sustained 1000 QPS. Note on sources: these are third-party benchmarks rather than official Milvus measurements, and the HolySheep run uses Milvus 2.4 rather than the current 2.6 line; treat them as positioning data, not as performance guarantees.
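
To make the benchmark vocabulary concrete, the sketch below shows one common way to compute p50 and p99 (the nearest-rank method) and derives two rates from the figures quoted above. The latency samples are hypothetical and are not taken from either benchmark; neither benchmark states which percentile method it used, so nearest-rank is an assumption.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical latency samples in milliseconds (illustration only).
latencies_ms = [5, 6, 6, 7, 6, 5, 9, 6, 35, 6]
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)

# Rates derived from the quoted figures.
bulk_insert_rate = 1_000_000 / (4 * 60 + 12)  # vectors per second for 1M in 4m 12s
failed_per_second = 1000 * (1 - 0.9982)       # failures per second at 1000 QPS
```

With these samples, p50 is 6 ms and p99 is 35 ms: a single slow request dominates the tail while leaving the median untouched. The quoted bulk insert works out to roughly 3,968 vectors per second, and a 99.82% success rate at 1000 QPS corresponds to about 1.8 failed requests per second.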