Unifying data governance with Immuta and AWS Lake Formation (DAT205)

Overview

  • The speaker, Zack Fredman, discusses unifying data governance with Immuta and AWS Lake Formation.
  • The presentation explains how access controls work in cloud data warehouses and data lakes, and how the security models of different platforms differ.
  • It also covers the challenges of managing access controls across heterogeneous data environments, along with possible solutions.

Access Controls in Cloud Data Platforms

  • Different platforms have varying approaches to access controls:
    • Snowflake: Supports table-level, row-level, and column-level security through role-based access control, row access policies, and column masking policies.
    • Databricks Unity Catalog: Offers user-based access controls, similar to Snowflake's approach, but does not yet have role inheritance.
    • Amazon Redshift: Provides table-level security through access privileges, row-level security through RLS policies, and column-level security through column-level privileges and CLS policies.
    • AWS Lake Formation: Offers table-level security through LF grants to IAM principals and LF TBAC (Tag-Based Access Control), row-level security through data cell filters, and column-level security through column-level permissions and data cell filters.
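
To make the differences concrete, the sketch below renders one row-level rule (a role may only see rows where region = 'EU') in the native form of three of these platforms. The table, column, role, database, and policy names are hypothetical, the account ID is a placeholder, and the statements are illustrative only, so check each platform's documentation before using them. The Lake Formation output is a payload shaped like the TableData argument of CreateDataCellsFilter; nothing is sent to AWS, and granting the filter to a principal would be a separate step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowRule:
    """Hypothetical rule: `role` may only see rows where `column` equals `value`."""
    table: str
    column: str
    value: str
    role: str


def quote(value: str) -> str:
    """Render a SQL string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def snowflake_statements(rule: RowRule) -> list[str]:
    """Snowflake: create a row access policy, then attach it to the table."""
    name = f"{rule.table}_{rule.column}_rap"
    condition = (
        f"CURRENT_ROLE() = {quote(rule.role.upper())} "
        f"AND {rule.column} = {quote(rule.value)}"
    )
    return [
        f"CREATE OR REPLACE ROW ACCESS POLICY {name} "
        f"AS ({rule.column} VARCHAR) RETURNS BOOLEAN -> {condition};",
        f"ALTER TABLE {rule.table} ADD ROW ACCESS POLICY {name} ON ({rule.column});",
    ]


def redshift_statements(rule: RowRule) -> list[str]:
    """Redshift: create an RLS policy, attach it to a role, enable RLS on the table."""
    name = f"{rule.table}_{rule.column}_rls"
    return [
        f"CREATE RLS POLICY {name} WITH ({rule.column} VARCHAR(256)) "
        f"USING ({rule.column} = {quote(rule.value)});",
        f"ATTACH RLS POLICY {name} ON {rule.table} TO ROLE {rule.role};",
        f"ALTER TABLE {rule.table} ROW LEVEL SECURITY ON;",
    ]


def lake_formation_filter(rule: RowRule, catalog_id: str, database: str) -> dict:
    """Lake Formation: build a data cell filter definition (not sent anywhere)."""
    return {
        "TableCatalogId": catalog_id,
        "DatabaseName": database,
        "TableName": rule.table,
        "Name": f"{rule.table}_{rule.column}_filter",
        "RowFilter": {"FilterExpression": f"{rule.column} = {quote(rule.value)}"},
        "ColumnWildcard": {},  # no excluded columns: every column stays visible
    }


# Example with made-up names; the account ID is a placeholder.
rule = RowRule(table="sales", column="region", value="EU", role="analyst")
for statement in snowflake_statements(rule) + redshift_statements(rule):
    print(statement)
print(lake_formation_filter(rule, catalog_id="123456789012", database="lake"))
```

The same intent ends up as two SQL statements on one platform, three on another, and a structured API payload on the third, which is the maintenance burden the rest of the session addresses.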

Challenges with Heterogeneous Data Environments

  • Open Table Format (OTF) catalogs, such as the Iceberg REST Catalog and Unity Catalog, aim to provide centralized metadata management and enable interoperability across platforms.
  • However, these catalogs do not yet have full parity with the access control capabilities of individual platforms:
    • Table-level security is generally service account-based RBAC.
    • Row-level security is lacking, but there are some spec proposals.
    • Column-level security has some "hacks" like virtual columns, and there are also spec proposals.

Unified Access Control Solution

  • Platforms that support OTF tables typically extend their security models to support OTF tables in the same way as their internal tables.
  • However, this still requires supporting the security models of each individual platform, unless using only open-source platforms like Spark, Flink, and Trino.
  • To address this challenge, a policy orchestration engine, like Immuta, can be used to unify access controls across different platforms.
    • Immuta becomes the security model, handling the translation and enforcement of policies across the underlying platforms.
    • Policies are defined and managed in Immuta, and the engine pushes them to the native security model of each platform.
    • This allows for unified access controls, regardless of the data storage locations or the query engines used.
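
The orchestration pattern described above can be sketched as a single policy definition fanned out to per-platform adapters. This is a toy model and not Immuta's API: the MaskPolicy class, the Orchestrator, the adapter functions, and all table, column, and role names are hypothetical, and each adapter only returns the statements or payloads it would apply instead of connecting to anything. The Redshift adapter assumes no other principal holds table-wide SELECT, so a column-level grant is enough to restrict the column.

```python
import json
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class MaskPolicy:
    """Platform-neutral rule: only `allowed_role` may read `column` of `table`."""
    table: str
    column: str
    allowed_role: str


# An adapter turns a neutral policy into the actions one platform needs.
Adapter = Callable[[MaskPolicy], list[str]]


def snowflake_adapter(p: MaskPolicy) -> list[str]:
    """Snowflake: a column masking policy keyed on the current role."""
    name = f"{p.table}_{p.column}_mask"
    return [
        f"CREATE OR REPLACE MASKING POLICY {name} AS (val STRING) RETURNS STRING -> "
        f"CASE WHEN CURRENT_ROLE() = '{p.allowed_role.upper()}' THEN val ELSE '***' END;",
        f"ALTER TABLE {p.table} MODIFY COLUMN {p.column} SET MASKING POLICY {name};",
    ]


def redshift_adapter(p: MaskPolicy) -> list[str]:
    """Redshift: a column-level privilege granted to the allowed role only."""
    return [f"GRANT SELECT ({p.column}) ON {p.table} TO ROLE {p.allowed_role};"]


def lake_formation_adapter(catalog_id: str, database: str) -> Adapter:
    """Lake Formation: a data cell filter that excludes the column.

    The filter would be granted to principals outside the allowed role;
    that grant is a separate step and is not modelled here.
    """
    def adapt(p: MaskPolicy) -> list[str]:
        payload = {
            "TableCatalogId": catalog_id,
            "DatabaseName": database,
            "TableName": p.table,
            "Name": f"{p.table}_without_{p.column}",
            "RowFilter": {"AllRowsWildcard": {}},
            "ColumnWildcard": {"ExcludedColumnNames": [p.column]},
        }
        return [json.dumps(payload)]
    return adapt


@dataclass
class Orchestrator:
    """Holds one adapter per platform and fans each policy out to all of them."""
    adapters: dict[str, Adapter] = field(default_factory=dict)

    def register(self, platform: str, adapter: Adapter) -> None:
        self.adapters[platform] = adapter

    def push(self, policy: MaskPolicy) -> dict[str, list[str]]:
        """Return the per-platform actions for one policy definition."""
        return {name: adapter(policy) for name, adapter in self.adapters.items()}


# Example with made-up names; the account ID is a placeholder.
orchestrator = Orchestrator()
orchestrator.register("snowflake", snowflake_adapter)
orchestrator.register("redshift", redshift_adapter)
orchestrator.register("lake_formation", lake_formation_adapter("123456789012", "lake"))

plan = orchestrator.push(MaskPolicy("customers", "email", "pii_reader"))
for platform, actions in plan.items():
    print(platform, actions)
```

Supporting a new platform in this model means registering one more adapter; the policy definition itself does not change.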

Conclusion

  • Unifying data governance across heterogeneous data environments can be challenging due to the varying security models of different platforms.
  • Open Table Format catalogs aim to provide centralized metadata management and interoperability, but do not yet have full parity with platform-specific access control capabilities.
  • A policy orchestration engine, like Immuta, can help bridge this gap by providing a unified security model and orchestrating the enforcement of access controls across the underlying data platforms.
