Architecture

Row / Column Security

The practice of restricting access to specific rows or columns within lakehouse tables based on user identity, role, or policy, enforced at query time by the compute engine or catalog layer.

Summary

Where it fits

Row/column security is the fine-grained access control layer for multi-tenant or regulated lakehouses on S3. Because S3 access control (IAM and bucket policies) operates on buckets, prefixes, and objects rather than on rows or columns inside a file, row/column security must be enforced by the query engine (Trino, Spark, Dremio) or by a catalog or policy layer (Polaris, Ranger) rather than by the storage layer.
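
As a rough illustration of engine-side enforcement, the sketch below uses an in-memory SQLite database as a stand-in for a query engine. The `orders` table, the tenant names, and the `orders_for_tenant` helper are all hypothetical; real engines such as Trino or Spark apply equivalent predicates inside their own planners rather than through a helper like this.

```python
import sqlite3

# In-memory SQLite stands in for the query engine; the table stands in for
# a shared lakehouse table that holds every tenant's rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (tenant_id TEXT, order_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("acme", 1, 10.0), ("acme", 2, 25.0), ("globex", 3, 40.0)],
)


def orders_for_tenant(conn, tenant_id):
    """Read `orders` through a tenant-scoped subquery rather than the base table."""
    sql = (
        "SELECT order_id, amount "
        "FROM (SELECT * FROM orders WHERE tenant_id = ?) "  # injected row filter
        "ORDER BY order_id"
    )
    # The tenant id is bound as a parameter, not concatenated into the SQL text.
    return conn.execute(sql, (tenant_id,)).fetchall()


print(orders_for_tenant(conn, "acme"))
print(orders_for_tenant(conn, "globex"))
```

The filter exists only in the path that goes through `orders_for_tenant`. Anything that reads the underlying table (or, in a lakehouse, the underlying files) directly never sees it, which is why the enforcement point matters.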

Misconceptions / Traps

S3 cannot enforce row or column-level access. Security policies must be enforced at the query engine or catalog layer, and any tool with direct S3 access can bypass them.
Row-level security applied at query time adds runtime overhead. Filter predicates must be injected into every query plan, and complex policies can degrade performance.
Column masking and row filtering are not always composable. Interactions between row filters and column masks can produce unexpected results if not carefully tested.
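
The composability trap can be shown with a small sketch. This is not how any particular engine evaluates policies; `ROWS`, `mask_email`, `row_filter`, and the two evaluation orders are hypothetical names chosen for illustration. The same row filter returns different rows depending on whether it sees the raw or the already-masked column value.

```python
# Hypothetical in-memory table; a real lakehouse would read these rows from files on S3.
ROWS = [
    {"id": 1, "email": "ana@corp.example", "region": "EU"},
    {"id": 2, "email": "bo@partner.example", "region": "EU"},
    {"id": 3, "email": "cy@corp.example", "region": "US"},
]


def mask_email(row):
    """Column mask: replace the email with a fixed placeholder."""
    return {**row, "email": "***"}


def row_filter(row):
    """Row filter: keep only rows whose email is in the corp.example domain."""
    return row["email"].endswith("@corp.example")


def filter_then_mask(rows):
    # The filter sees raw values; the mask then hides them in the output.
    return [mask_email(r) for r in rows if row_filter(r)]


def mask_then_filter(rows):
    # The filter sees already-masked values, so the predicate never matches.
    return [m for m in (mask_email(r) for r in rows) if row_filter(m)]


print([r["id"] for r in filter_then_mask(ROWS)])  # rows 1 and 3
print([r["id"] for r in mask_then_filter(ROWS)])  # no rows
```

Which order a given engine or catalog uses is something to verify against its documentation and with tests, especially when a row filter references a column that is also masked.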

Key Connections

scoped_to Lakehouse, S3 — access control for S3-stored table data
depends_on Apache Ranger — policy engine for row/column security
enables Tenant Isolation — row-level filtering is a key tenant isolation mechanism
enables Compliance-Aware Architectures — regulatory requirement for data access control

Definition

What it is

An access control pattern that restricts visibility of specific rows or columns in lakehouse tables based on user identity, role, or policy — enforced at query time without duplicating data on S3.

Why it exists

S3 stores data as files with bucket/key-level access control. Fine-grained security (hiding salary columns from non-HR users, filtering rows by region) cannot be achieved at the storage layer alone. Row/column security adds a logical access control layer above the physical files.
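
A minimal sketch of that logical layer, assuming a single shared copy of the data and purely illustrative rules: `EMPLOYEES`, `User`, and `apply_policy` are hypothetical names, and the role names `hr` and `admin` are assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical stored rows; one physical copy shared by every reader.
EMPLOYEES = [
    {"name": "Ana", "region": "EU", "salary": 90000},
    {"name": "Bo", "region": "US", "salary": 85000},
]


@dataclass(frozen=True)
class User:
    name: str
    roles: frozenset
    region: str


def apply_policy(rows, user):
    """Return the view of `rows` that `user` may see (illustrative rules only)."""
    visible = []
    for row in rows:
        # Row rule: non-admins only see rows from their own region.
        if "admin" not in user.roles and row["region"] != user.region:
            continue
        out = dict(row)  # copy, so the stored row is never modified
        # Column rule: salary is nulled out for anyone outside HR.
        if "hr" not in user.roles:
            out["salary"] = None
        visible.append(out)
    return visible


analyst = User("dee", frozenset({"analyst"}), "EU")
hr_admin = User("eli", frozenset({"hr", "admin"}), "US")

print(apply_policy(EMPLOYEES, analyst))
print(apply_policy(EMPLOYEES, hr_admin))
```

Both users read the same stored rows; only the per-user view differs, which is what lets this pattern avoid maintaining per-role copies of the data on S3.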

Primary use cases

Multi-tenant data access on shared lakehouse tables, GDPR-compliant column masking, role-based row filtering for analytics.

Recent developments

Latest signals

Apache Polaris graduated to an ASF Top-Level Project (February 18, 2026). Polaris ships fine-grained access control with SQL-based row/column-security policies, so Iceberg's open-standard governance answer now has the same ASF top-level status as Iceberg itself. Per Estuary, "Iceberg Catalog Showdown: Apache Polaris vs Unity Catalog".
Snowflake Horizon + Polaris API: row filters and column masking enforced cross-engine. Horizon intercepts Open Catalog REST API calls and down-scopes the Iceberg metadata returned to external query engines based on the user's role and permissions. As a result, policies defined in Snowflake apply when Spark, Trino, or Dremio reads the same tables. Per DataLakehouseHub, "Choosing the Right Iceberg Control Plane: Polaris vs Unity Catalog vs Cloud REST" (May 2026).
Unity Catalog ABAC extends row/column policies cross-engine, starting with Spark. Row filters and column masks are defined once in Unity Catalog ABAC; the policies then apply automatically across Databricks and Spark (now), with further expansion planned. This is the "policy travels with the data" pattern that closes the cross-engine governance gap. Per Databricks, "Discover ABAC with Unity Catalog".
Cross-format unification: Unity Catalog covers both Delta and Iceberg with one security model. Databricks' 2026 framing is a single, scalable security model for Delta and Iceberg that eliminates format silos for governance, so practitioners no longer maintain two parallel policy stacks. Per Databricks Blog, "Completing the Lakehouse Vision: Open Storage + Open Access + Unified Governance".
Critical limitation: the Iceberg REST and Unity REST APIs can't access tables with row filters or column masks. This is a known constraint as of 2026: the REST API path does not go through the row/column-security enforcement layer, so protected tables are unavailable to tools that read through those APIs. Practitioners must go through the query-engine layer for those code paths. Per Atlan, "5 Unity Catalog Limitations to Overcome in 2026".
Apache Ranger remains the open-source incumbent for non-Databricks shops. Ranger is still the default fine-grained policy engine for on-prem Spark/Trino/Hive deployments: open-source, format-neutral, and broadly integrated across engines. The competitive landscape between Polaris and Ranger is still taking shape. Per Databricks Docs, "Row Filters + Column Masks".