Semantic Search
Querying S3-derived vector embeddings to find content by meaning rather than exact keyword match.
Summary
Semantic search is the retrieval layer that makes LLMs useful over S3 data. It powers the "R" in RAG — finding the most relevant S3-stored documents for a given query without requiring exact keyword matches.
- Semantic search is approximate, not exact. Results are ranked by similarity score rather than matched precisely, so false positives are possible and must be handled, for example with a minimum-score threshold or a reranking step.
- Semantic search requires embedding generation as a prerequisite. You cannot search semantically without first vectorizing the S3 data.
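The two points above can be illustrated with a minimal sketch, assuming embeddings have already been generated and are held in memory. The vectors, object keys, and the `min_score` cutoff are made up for illustration; the brute-force scan stands in for the approximate nearest-neighbor index a production system would typically use.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def search(query_vec, index, top_k=3, min_score=0.6):
    """Rank every indexed vector against the query, best first,
    then drop matches below min_score to limit false positives."""
    scored = [(cosine_similarity(query_vec, vec), key) for key, vec in index.items()]
    scored.sort(reverse=True)
    return [(key, score) for score, key in scored[:top_k] if score >= min_score]

# Hypothetical precomputed embeddings, keyed by S3 object key.
index = {
    "docs/refund-policy.txt": [0.9, 0.1, 0.0],
    "docs/shipping-times.txt": [0.2, 0.9, 0.1],
    "docs/office-party.txt": [0.0, 0.1, 0.9],
}
# Hypothetical embedding of the query "how do I get my money back?"
query_vec = [0.8, 0.3, 0.0]

results = search(query_vec, index)
for key, score in results:
    print(f"{score:.3f}  {key}")
```

Every object receives a score, so something is always "closest" even when nothing is relevant. The threshold is one simple way to avoid returning those weak matches; the right cutoff depends on the embedding model and has to be tuned on real queries.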
- depends_on: Embedding Model (needs vectors to search)
- enables: Hybrid S3 + Vector Index (the retrieval mechanism for the pattern)
- augments: Lakehouse Architecture (adds semantic retrieval to structured data)
- scoped_to: LLM-Assisted Data Systems, Vector Indexing on Object Storage
Definition
The ability to query S3-derived vector embeddings to find content by meaning rather than exact keyword match.
S3 objects cannot be searched by content natively. Semantic search, built on embeddings generated from S3 data, allows users to find relevant documents, records, or media by describing what they need in natural language.
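As a rough sketch of that flow, the example below indexes a few object bodies and then answers a natural-language query. Everything in it is an assumption for illustration: `toy_embed` and its hand-written concept table stand in for a real embedding model, and the `objects` dictionary stands in for content read from an S3 bucket.

```python
import math

# Hypothetical concept lexicon: words with related meanings share an axis.
# A real embedding model learns this from data; this table is only a stand-in.
CONCEPTS = {
    "car": 0, "automobile": 0, "vehicle": 0,
    "cheap": 1, "affordable": 1, "low-cost": 1,
    "repair": 2, "maintenance": 2, "fix": 2,
    "recipe": 3, "cooking": 3, "dinner": 3,
}
DIM = 4

def toy_embed(text):
    """Map text to a unit-length concept vector (stand-in for an embedding model)."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        axis = CONCEPTS.get(word)
        if axis is not None:
            vec[axis] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

def build_index(objects):
    """Step 1: vectorize every object body once, ahead of query time."""
    return {key: toy_embed(body) for key, body in objects.items()}

def query(index, text, top_k=1):
    """Step 2: embed the query the same way and rank by dot product
    (equal to cosine similarity here because vectors are unit length)."""
    q = toy_embed(text)
    scored = sorted(
        ((sum(a * b for a, b in zip(q, vec)), key) for key, vec in index.items()),
        reverse=True,
    )
    return [key for score, key in scored[:top_k]]

# In-memory stand-in for objects read from an S3 bucket (keys are made up).
objects = {
    "guides/garage.txt": "affordable automobile maintenance tips",
    "guides/kitchen.txt": "quick dinner recipe ideas",
}
index = build_index(objects)
top = query(index, "cheap car repair")
print(top)
```

The query shares no words with the matching document, yet it ranks first because both map to the same concept axes. A keyword search over the same two objects would return nothing for this query.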
Typical use cases include document retrieval for RAG over S3 data, knowledge discovery in S3-stored archives, and content recommendation from S3-backed media libraries.
Resources
AWS announcement of S3 Vectors, the first cloud object store with native vector search, enabling sub-second semantic queries over billions of embeddings.
Official OpenSearch documentation for semantic search using vector embeddings, the primary open-source engine used alongside S3-backed vector stores.
AWS Big Data Blog showing how S3 Vectors integrates with OpenSearch Service for hybrid semantic + keyword search.