Architecture

Hybrid Retrieval

Definition

What it is

A retrieval pattern that combines **dense vector similarity** (semantic search via embeddings) with **sparse lexical search** (BM25 over an inverted index), merges the two ranked result sets using **Reciprocal Rank Fusion (RRF)**, and passes the fused candidates through a **cross-encoder reranker** for a high-precision final pass. The output is a small, highly relevant context set for the LLM, anchored both semantically and lexically.
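
The fusion step can be sketched in a few lines. This is a minimal illustration, not a reference implementation: it assumes each retriever returns a best-first list of document IDs, uses the commonly cited RRF constant `k = 60`, and scores each document as the sum of `1 / (k + rank)` over the lists it appears in. The function name `reciprocal_rank_fusion` and the document IDs are hypothetical, and the cross-encoder pass is left out because it depends on the model in use.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse best-first lists of doc IDs; each doc scores sum(1 / (k + rank))."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical ranked outputs from the two retrievers (best first).
dense_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)
```

Documents that both retrievers rank highly (`doc_a`, `doc_c`) rise to the top, while documents seen by only one retriever still survive as candidates for the reranker. Because RRF uses ranks rather than raw scores, the dense and BM25 scores never need to be normalized against each other.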

Why it exists

Pure vector search excels at concept matching but fails on exact-match queries: product SKUs, legal clause numbers, specific API names, regulatory references. Pure BM25 is the inverse: rock-solid on exact matches, blind to paraphrase. As of 2026, production retrieval systems commonly run both in parallel and fuse the results, because either signal alone loses recall in exactly the cases the other handles best.
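
To make the exact-match half of that argument concrete, here is a toy BM25 scorer. It is a sketch under stated assumptions: a plain whitespace tokenizer, the common defaults `k1 = 1.5` and `b = 0.75`, and three made-up documents. A real system would use a search engine's inverted index rather than this loop.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each doc against the query with the Okapi BM25 formula."""
    tokenized = [d.lower().split() for d in docs]  # assumption: whitespace tokens
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of docs containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # no lexical overlap, no contribution
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical corpus: two near-identical SKUs and one paraphrase-only doc.
docs = [
    "replacement filter for part sku-4471 ships in two days",
    "how to swap the air filter in your unit",
    "sku-4417 is the older filter model",
]
scores = bm25_scores("sku-4471", docs)
print(scores)
```

Only the first document gets a non-zero score. The near-miss `sku-4417` scores zero, which is the precision that embeddings tend to blur. The second document, relevant only by paraphrase, also scores zero, which is the gap the dense retriever is there to cover.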

Primary use cases

Enterprise RAG over regulated corpora (financial filings, legal contracts, medical records); code-aware retrieval where identifier-level precision matters; agentic memory systems that need verifiable provenance for retrieved chunks; and search over jargon-dense technical documentation.
