Architecture

Edge-to-Core Object Aggregation

Summary

What it is

A one-way replication pattern where data collected at edge S3-compatible storage nodes is continuously replicated to a central S3 data lake for durable storage, analytics, and processing.

Where it fits

Edge-to-core aggregation is a common data flow pattern for IoT, retail, and other distributed organizations. Edge nodes provide local write performance and short-term storage; the central S3 lake provides durability, governance, and analytics at scale.

Misconceptions / Traps
  • One-way replication simplifies conflict resolution (no write conflicts) but introduces data loss risk if edge nodes fail before replication completes. Monitor replication lag and edge node health.
  • Bandwidth between edge and core is often constrained. Compression, deduplication, and priority-based replication are important for WAN-limited edge sites.
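
The two traps above can be illustrated together with a minimal sketch of an edge-side replication queue. It assumes an in-memory queue, whole-object deduplication by SHA-256 digest, and gzip compression; the `ReplicationQueue` and `PendingObject` names are hypothetical and not part of any S3 or vendor API. The lag value is simply the age of the oldest object that has not yet left the edge, which is the window of data at risk if the node fails.

```python
import gzip
import hashlib
import heapq
import time
from dataclasses import dataclass, field


@dataclass(order=True)
class PendingObject:
    priority: int                          # lower value is replicated first
    created_at: float                      # epoch seconds when written at the edge
    key: str = field(compare=False)
    payload: bytes = field(compare=False)


class ReplicationQueue:
    """Hypothetical in-memory queue of objects waiting to be copied to the core."""

    def __init__(self):
        self._heap = []
        self._seen_digests = set()

    def enqueue(self, key, payload, priority, created_at=None):
        """Queue an object; return False if identical content was already queued."""
        digest = hashlib.sha256(payload).hexdigest()
        if digest in self._seen_digests:
            return False                   # deduplicated: do not spend WAN bandwidth twice
        self._seen_digests.add(digest)
        ts = time.time() if created_at is None else created_at
        heapq.heappush(self._heap, PendingObject(priority, ts, key, payload))
        return True

    def next_compressed(self):
        """Pop the highest-priority object and return (key, gzip-compressed body).

        Raises IndexError if the queue is empty.
        """
        item = heapq.heappop(self._heap)
        return item.key + ".gz", gzip.compress(item.payload)

    def lag_seconds(self, now=None):
        """Age of the oldest object still waiting at the edge (0.0 if none)."""
        if not self._heap:
            return 0.0
        now = time.time() if now is None else now
        return now - min(item.created_at for item in self._heap)
```

In this sketch an alert enqueued with `priority=1` is sent before bulk logs enqueued with `priority=5`, and `lag_seconds()` is the figure an operator would alert on. A real deployment would persist the queue to disk and would likely deduplicate at a finer granularity than whole objects.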

Key Connections
  • scoped_to Geo / Edge Object Storage — the ingestion pattern for edge-collected data
  • depends_on S3 API — replication uses S3-compatible protocols
  • constrained_by Egress Cost — WAN transfer costs for edge-to-core replication

Definition

What it is

A one-way replication pattern from many edge S3-compatible object stores to a central S3 data lake, consolidating distributed data for centralized analytics and archival.

Why it exists

Edge locations (factories, retail stores, remote offices) generate data that must eventually reach a central data lake for analytics, compliance, and long-term retention. Edge-to-core aggregation automates this flow without requiring real-time connectivity.
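
One way to picture "without requiring real-time connectivity" is a store-and-forward buffer: writes always succeed locally, and a drain step forwards them in order whenever the link is available. The sketch below is illustrative only. The `StoreAndForward` class and the `send` callable are hypothetical, and `send` is assumed to raise `ConnectionError` while the WAN link is down.

```python
from collections import deque


class StoreAndForward:
    """Hypothetical edge buffer: accept writes locally, forward when the link is up."""

    def __init__(self, send):
        # send(key, body) is an assumed callable that copies one object to the
        # core and raises ConnectionError if the link is unavailable.
        self._send = send
        self._spool = deque()

    def write(self, key, body):
        """Local write; never blocks on connectivity to the core."""
        self._spool.append((key, body))

    def drain(self):
        """Forward spooled objects in order; stop at the first connectivity failure."""
        sent = 0
        while self._spool:
            key, body = self._spool[0]
            try:
                self._send(key, body)
            except ConnectionError:
                break                      # keep the object spooled and retry later
            self._spool.popleft()          # remove only after a successful send
            sent += 1
        return sent

    def pending(self):
        """Number of objects not yet replicated to the core."""
        return len(self._spool)
```

An object is removed from the spool only after `send` returns, so a dropped link leaves it queued for the next `drain()` call. The in-memory `deque` is a simplification: a production edge node would spool to local disk or a local S3-compatible bucket so that pending data survives a restart.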

Primary use cases

IoT data aggregation, retail point-of-sale data collection, remote office backup to central S3, distributed log aggregation.

Recent developments

Latest signals
  • AWS IoT Greengrass v2.17 (April 2026): non-root install and lightweight components. The edge runtime now runs as non-root on Linux, and the new lightweight components use significantly less memory. This matters for the resource-constrained sensors and gateways where Greengrass has historically struggled. Per AWS (IoT Greengrass v2.17 announcement).
  • Greengrass local filter, aggregate, and send is the canonical edge-to-S3 pipeline. Greengrass collects, aggregates, and filters data at the edge, then ships it to S3, which makes it the reference architecture for "many edges → one lake" deployments. This reduces upstream bandwidth by 10-100×, depending on the filtering applied. Per AWS (IoT Greengrass) and AWS Docs (What is IoT Greengrass).
  • AWS Snow Family supports Greengrass, Lambda, and EC2 at the edge. Snow devices can host all three locally, so the "edge" in edge-to-core aggregation is itself a compute platform. In disconnected or intermittent-connectivity scenarios, the Snow device doubles as the local cache before the eventual S3 sync. Per SupportPro (AWS Snow Family Guide).
  • AWS Connected Edge Intelligence positions edge-to-core as an AI pattern. In AWS's 2026 framing, edge intelligence pre-processes raw sensor data into features and embeddings before sending them to S3, so the edge becomes part of the AI data pipeline rather than just a forwarder. Per AWS IoT Blog (Unlocking Real-Time Intelligence at the Edge).
  • Vendor ecosystem: Edge Impulse, Wevolver, and others are standardizing the gateway architecture. Edge Impulse's Greengrass deployment integration and Wevolver's IoT-gateway protocol-translation patterns are shaping 2026's reference architectures. The common pattern: protocol translation at the edge → aggregation → batched S3 upload. Per Edge Impulse (AWS IoT Greengrass Deployment Integration) and Wevolver (IoT Gateway Architecture: Edge vs Cloud, Protocol Translation, Deployment).
  • S3 as the canonical edge-data persistence sink. Greengrass docs treat the S3 write path as a first-class pattern: generated data lands in S3, which ensures persistence and enables downstream analytics and archival. No bespoke "edge-data sink" architecture is required. Per VividCloud (Empowering IoT with AWS Greengrass and Edge Computing).
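
The filter → aggregate → batched upload pattern that recurs in the signals above can be sketched in a few lines. This is a generic illustration, not Greengrass's component API: the function names, the reading format (`{"sensor": ..., "value": ...}`), and the range-based filter are all assumptions made for the example.

```python
import gzip
import json
from statistics import mean


def aggregate_readings(readings, min_value, max_value):
    """Drop out-of-range readings, then summarize the rest per sensor."""
    by_sensor = {}
    for reading in readings:
        if min_value <= reading["value"] <= max_value:      # filter step
            by_sensor.setdefault(reading["sensor"], []).append(reading["value"])
    # Aggregate step: one summary record per sensor, in stable sensor order.
    return [
        {"sensor": sensor, "count": len(values), "mean": mean(values), "max": max(values)}
        for sensor, values in sorted(by_sensor.items())
    ]


def build_batch(summaries):
    """Serialize summaries as gzip-compressed JSON Lines, ready for one upload."""
    lines = "\n".join(json.dumps(summary, sort_keys=True) for summary in summaries)
    return gzip.compress(lines.encode("utf-8"))
```

The bytes returned by `build_batch` would then be written to the central bucket as a single object, for example with boto3's `put_object(Bucket=..., Key=..., Body=...)`; the bucket name and key layout are deployment-specific and left out here. The bandwidth saving comes from sending one summary record per sensor per batch instead of every raw reading.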
