Milvus
A distributed vector database built for billion-scale similarity search, using a microservices architecture with SSD caching for hot data and native S3 cold storage offload.
Summary
Milvus is an enterprise-scale vector database for organizations that need to search billions of vectors. Its S3 integration for cold data offload and its log-based write-ahead design make it a strong choice when scale exceeds what a single-node deployment of a vector database such as Qdrant or Weaviate can handle, at the cost of significantly higher operational complexity.
- Milvus is a distributed system with significant operational complexity. Running a cluster requires etcd, MinIO or S3, and Pulsar or Kafka; it is not a single-binary deployment.
- S3 is used for persistent storage and log backup, not as a live query tier. Query performance depends on in-memory and SSD-cached segments, not S3 latency.
- The microservices architecture enables scaling but introduces failure modes absent in simpler vector databases. Expect to invest in monitoring and operations.
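The point about S3 not being a live query tier can be illustrated with a toy cache. The sketch below is not Milvus code: `SegmentCache`, the `store` dict standing in for S3 or MinIO, and the LRU eviction policy are all assumptions made for illustration. It only shows why repeated queries against hot segments avoid object-store round trips, while a working set larger than the cache keeps paying for them.

```python
from collections import OrderedDict


class SegmentCache:
    """Toy LRU cache standing in for the memory/SSD tier (hypothetical, not Milvus code)."""

    def __init__(self, object_store, capacity):
        self.object_store = object_store  # dict standing in for S3/MinIO
        self.capacity = capacity          # max number of segments held locally
        self.cached = OrderedDict()       # segment_id -> vectors, oldest first
        self.cold_loads = 0               # count of object-store fetches

    def get(self, segment_id):
        if segment_id in self.cached:
            # Cache hit: served locally, no object-store latency.
            self.cached.move_to_end(segment_id)
            return self.cached[segment_id]
        # Cache miss: pay the object-store round trip, then keep the segment local.
        self.cold_loads += 1
        vectors = self.object_store[segment_id]
        self.cached[segment_id] = vectors
        if len(self.cached) > self.capacity:
            self.cached.popitem(last=False)  # evict the least recently used segment
        return vectors


store = {"seg-a": [[0.1, 0.2]], "seg-b": [[0.3, 0.4]], "seg-c": [[0.5, 0.6]]}
cache = SegmentCache(store, capacity=2)
for seg in ["seg-a", "seg-b", "seg-a", "seg-a", "seg-c", "seg-b"]:
    cache.get(seg)
```

Six lookups trigger four object-store fetches here, because the three-segment working set does not fit in a two-segment cache. Sizing the hot tier to the working set is what keeps object-store latency off the query path.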
- depends_on S3: uses S3 for persistent object storage of segments and logs
- scoped_to Vector Indexing on Object Storage: billion-scale vector search over S3 data
- solves Cold Scan Latency: hot vector caching with durable S3 persistence
Definition
A distributed vector database designed for billion-scale workloads. Uses a microservices architecture with SSD-backed caching for hot data and native offloading of cold vectors to S3-compatible object storage.
Single-node vector databases hit memory and throughput ceilings at billion-vector scale. Milvus distributes the index across a cluster, uses tiered storage to keep hot vectors in memory while parking cold vectors on S3, and scales compute independently from storage.
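Distributing the index across a cluster generally implies a scatter-gather query pattern: each node searches its own shard, and the partial results are merged into a global top-k. The following is a minimal sketch of that idea, assuming brute-force L2 distance and in-process dicts as shards. The function names (`search_shard`, `search_cluster`) are hypothetical and do not correspond to Milvus internals, which use approximate indexes and separate query nodes.

```python
import heapq
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search_shard(shard, query, k):
    """Return the k nearest (distance, id) pairs within one shard (brute force)."""
    scored = ((l2_distance(vec, query), vec_id) for vec_id, vec in shard.items())
    return heapq.nsmallest(k, scored)


def search_cluster(shards, query, k):
    """Scatter the query to every shard, then merge local top-k lists into a global top-k."""
    partial = [hit for shard in shards for hit in search_shard(shard, query, k)]
    return heapq.nsmallest(k, partial)


# Three toy shards, each holding a slice of the collection.
shards = [
    {"v1": [0.0, 0.0], "v2": [5.0, 5.0]},
    {"v3": [1.0, 0.0], "v4": [9.0, 9.0]},
    {"v5": [0.0, 2.0]},
]
top = search_cluster(shards, query=[0.0, 0.0], k=2)
top_ids = [vec_id for _, vec_id in top]
```

Because every shard returns its own k best candidates, the merge step only has to rank at most `k * len(shards)` hits. This is why adding shards scales capacity without a proportional growth in merge cost.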
Billion-scale embedding search for enterprise RAG systems, tiered vector storage with S3 cold tier, distributed similarity search across massive embedding collections.