Technology

LanceDB

Summary

What it is

A vector database that stores data in the Lance columnar format directly on object storage. Designed for serverless vector search without a separate index server.

Where it fits

LanceDB is the S3-native option for vector search. Unlike Milvus or Pinecone, which are typically run as dedicated services, LanceDB stores both raw data and vector indexes as files on S3. This aligns with the principle of separating storage from compute and removes a separate infrastructure layer for vector search.

Misconceptions / Traps

  • Serverless on S3 means higher query latency than in-memory vector databases, because index and data reads go over the network to object storage rather than to local memory. LanceDB trades latency for simplicity and cost.
  • LanceDB uses the Lance format, not Parquet. Data must be converted or ingested into Lance format for vector search.
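The conversion step in the second point can be sketched as follows. This is a minimal illustration, assuming the `lancedb` and `pyarrow` packages are installed and that the Parquet file already holds embeddings in a fixed-size list column named `vector`. The helper name `ingest_parquet`, the default table name `docs`, and the paths are hypothetical.

```python
def ingest_parquet(parquet_path, uri, table_name="docs"):
    """Copy the rows of a Parquet file into a LanceDB table stored at `uri`."""
    # Imported inside the function so the dependencies are only needed on call.
    import lancedb                  # assumption: `pip install lancedb`
    import pyarrow.parquet as pq    # pyarrow is installed alongside lancedb

    # Read the Parquet file into an in-memory Arrow table.
    arrow_table = pq.read_table(parquet_path)

    # `uri` may be a local directory or an object-store URI such as
    # "s3://my-bucket/lancedb" (hypothetical bucket name).
    db = lancedb.connect(uri)

    # Writing the Arrow table through LanceDB persists it in Lance format.
    # mode="overwrite" replaces any existing table with the same name.
    return db.create_table(table_name, data=arrow_table, mode="overwrite")
```

After ingestion the Parquet file is no longer involved in queries; searches run against the Lance files written under `uri`.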

Key Connections

  • indexes MinIO, AWS S3 — builds vector indexes over S3-stored data
  • implements Hybrid S3 + Vector Index — the canonical implementation of this pattern
  • scoped_to Vector Indexing on Object Storage, S3

Definition

What it is

A vector database that stores data in the Lance columnar format directly on object storage. Designed for serverless vector search without requiring a separate vector index server.

Why it exists

Traditional vector databases require dedicated infrastructure. LanceDB stores both raw data and vector indexes as files on S3, aligning with the separation of storage and compute principle and avoiding a separate operational layer for vector search.
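As a minimal sketch of what "no separate operational layer" can look like in practice, the following uses the `lancedb` Python package (assumed to be installed) to create a table and run a nearest-neighbour query directly against a storage location. The helper name `nearest_documents`, the table name `docs`, and the sample rows are illustrative assumptions. A local directory works as the location, and an S3 URI can be passed instead when credentials are configured.

```python
def nearest_documents(uri, query_vector, k=2):
    """Create a tiny demo table at `uri` and return the k rows closest to `query_vector`."""
    # Imported inside the function so the dependency is only needed on call.
    import lancedb  # assumption: `pip install lancedb`

    # `uri` may be a local directory or an object-store URI such as
    # "s3://my-bucket/lancedb" (hypothetical bucket name). No server is started;
    # the library reads and writes files at this location.
    db = lancedb.connect(uri)

    # Illustrative rows: raw fields and the embedding live in the same table.
    rows = [
        {"id": 1, "text": "quarterly sales report", "vector": [1.0, 0.0]},
        {"id": 2, "text": "employee handbook", "vector": [0.0, 1.0]},
        {"id": 3, "text": "sales forecast", "vector": [0.8, 0.2]},
    ]
    # mode="overwrite" keeps the sketch re-runnable against the same location.
    table = db.create_table("docs", data=rows, mode="overwrite")

    # No index has been built here, so this is an exact scan over all rows.
    # Results come back ordered by distance to the query vector.
    return table.search(query_vector).limit(k).to_arrow().to_pylist()
```

Each returned row is a dictionary containing the stored fields plus a `_distance` value. Pointing the same function at an S3 URI changes where the files live, not how the query is written.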

Primary use cases

Semantic search over S3-stored documents, retrieval-augmented generation (RAG) with S3-backed corpora, serverless vector search.
