Architecture

Online Embedding Refresh Pipeline

A continuous pipeline that regenerates vector embeddings as source data in S3 changes, keeping vector indexes in sync with the latest content without full re-embedding.

Summary

Where it fits

This pattern solves the stale-embedding problem in RAG and semantic search systems. When S3 objects are created, updated, or deleted, the pipeline detects changes (via S3 events), re-embeds affected content, and updates the vector index — maintaining search accuracy.
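
The detect, re-embed, update loop described above can be sketched as a small handler over an S3 event notification payload. This is a minimal sketch under stated assumptions, not the pipeline from the linked resources: `read_object`, `toy_embed`, and `InMemoryIndex` are hypothetical stand-ins for an S3 read, an embedding model call, and a vector index client. The `Records`, `eventName`, `s3.bucket.name`, and `s3.object.key` fields follow the S3 event notification message structure, in which object keys arrive URL-encoded.

```python
from urllib.parse import unquote_plus


def toy_embed(text):
    """Hypothetical stand-in for an embedding model call (not a real model)."""
    return [float(len(text)), float(sum(ord(c) for c in text) % 97)]


class InMemoryIndex:
    """Hypothetical stand-in for a vector index client."""

    def __init__(self):
        self.vectors = {}

    def upsert(self, doc_id, vector):
        self.vectors[doc_id] = vector

    def delete(self, doc_id):
        # Deleting an unknown id is a no-op, so replayed events are harmless.
        self.vectors.pop(doc_id, None)


def handle_s3_event(event, read_object, embed, index):
    """Apply each S3 event record to the index and return the actions taken."""
    applied = []
    for record in event.get("Records", []):
        event_name = record["eventName"]
        bucket = record["s3"]["bucket"]["name"]
        # Keys in S3 event notifications are URL-encoded ("a+b.txt" -> "a b.txt").
        key = unquote_plus(record["s3"]["object"]["key"])
        doc_id = f"{bucket}/{key}"
        if event_name.startswith("ObjectCreated:"):
            # Covers both new objects and overwrites of existing ones.
            index.upsert(doc_id, embed(read_object(bucket, key)))
            applied.append(("upsert", doc_id))
        elif event_name.startswith("ObjectRemoved:"):
            index.delete(doc_id)
            applied.append(("delete", doc_id))
    return applied
```

Because upserts and deletes are keyed by a stable document id, processing the same record twice leaves the index in the same state, which matters when events are redelivered.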

Misconceptions / Traps

"Online" does not mean real-time. End-to-end latency is the sum of event delivery, embedding model inference, and index update propagation; minutes-level latency is typical.
Change detection at scale is not trivial. S3 event notifications are designed for at-least-once delivery, so events can arrive late, duplicated, or out of order, and a failing consumer can still drop a change. Consider combining event-driven updates with periodic full-scan reconciliation.
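
One way to picture the periodic full-scan reconciliation mentioned above is a diff between a fresh bucket listing and a record of what was last embedded. The sketch below assumes the pipeline keeps a hypothetical manifest mapping each key to the ETag it had when it was embedded, and treats the ETag as a marker of object content. Both dictionaries are plain in-memory inputs here; how the listing and the manifest are obtained is left out.

```python
def reconcile(listing, indexed):
    """Compare a full bucket scan against the embedding manifest.

    listing: key -> ETag currently in the bucket (from a full scan).
    indexed: key -> ETag the object had when it was last embedded
             (hypothetical manifest maintained by the pipeline).
    Returns (keys to re-embed, keys to delete from the index).
    """
    # New keys and keys whose ETag changed were missed or are out of date.
    to_embed = sorted(k for k, etag in listing.items() if indexed.get(k) != etag)
    # Keys still in the manifest but gone from the bucket are orphaned vectors.
    to_delete = sorted(k for k in indexed if k not in listing)
    return to_embed, to_delete
```

Running this on a schedule bounds how long a missed or misordered event can leave the index out of sync, at the cost of one full listing per run.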

Key Connections

depends_on Embedding Model — requires an embedding model for re-vectorization
solves Hybrid S3 + Vector Index drift — keeps embeddings in sync with source data
constrained_by High Cloud Inference Cost — continuous embedding has ongoing cost
scoped_to Vector Indexing on Object Storage, LLM-Assisted Data Systems

Definition

What it is

A continuous or near-real-time pipeline that detects changes in S3-stored source data, regenerates affected embeddings, and updates vector indexes — keeping semantic search results fresh without full re-indexing.
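
As an illustration of regenerating only the affected embeddings, the sketch below hashes fixed-size chunks of a document and re-embeds only chunks whose hash was not seen before. Chunk-level hashing is an assumption made for this example, not something the source prescribes; `chunk_text`, `chunk_hash`, and `plan_refresh` are hypothetical helper names.

```python
import hashlib


def chunk_text(text, size=200):
    """Split text into fixed-size character chunks (last chunk may be shorter)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def chunk_hash(chunk):
    """Content hash used as a stable identifier for a chunk."""
    return hashlib.sha256(chunk.encode("utf-8")).hexdigest()


def plan_refresh(old_hashes, new_text, size=200):
    """Decide which chunks of the updated document need new embeddings.

    Returns (chunks to embed as (hash, text) pairs,
             stale hashes to remove from the index,
             full hash list for the new version).
    """
    new_chunks = chunk_text(new_text, size)
    new_hashes = [chunk_hash(c) for c in new_chunks]
    known = set(old_hashes)
    to_embed = []
    for h, c in zip(new_hashes, new_chunks):
        if h not in known:
            to_embed.append((h, c))
            known.add(h)  # a repeated new chunk is embedded only once
    current = set(new_hashes)
    stale = [h for h in old_hashes if h not in current]
    return to_embed, stale, new_hashes
```

Fixed-size chunking is the simplest choice but shifts every later chunk boundary when text is inserted mid-document, so this approach saves the most work for appends and in-place edits.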

Why it exists

Offline batch embedding pipelines leave vector indexes stale between runs. For applications where data changes frequently (knowledge bases, product catalogs), continuous embedding refresh ensures search results reflect the latest content.

Primary use cases

Near-real-time RAG index updates, continuous product catalog embedding, fresh knowledge base vectorization.

Connections (5)

Outbound (5)

scoped_to (3): S3, LLM-Assisted Data Systems, Vector Indexing on Object Storage

depends_on (1): Embedding Model

constrained_by (1): High Cloud Inference Cost

Resources (2)

Blog (High)

aws.amazon.com/blogs/big-data/generate-vector-embeddings-for...

AWS Big Data Blog showing a Lambda-based embedding refresh pipeline that processes S3 events to keep vector indexes current.

GitHub (High)

github.com/aws-samples/text-embeddings-pipeline-for-rag

AWS sample repository providing a complete pipeline for continuous embedding generation from S3-stored documents.