Data mesh at Moderna: One dbt to unify data and people (DAT206)

Summary of the Video Transcript

Introduction to dbt

  • dbt (data build tool) is an opinionated workflow for developing data transformations, modeled on the software development life cycle.
  • dbt provides active metadata: as you work with dbt, you produce metadata about the state of your warehouse and your transformation DAG (directed acyclic graph).
  • This metadata layer enables a data control plane, where you can build, deploy, orchestrate, observe, and catalog your entire analytics stack through dbt.
  • dbt lets you govern your data and develop transformations faster and with more confidence, which results in better-governed data pipelines.
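
To make the idea of active metadata more concrete, here is a minimal Python sketch that walks a dependency graph shaped like the `nodes` section of dbt's `manifest.json` artifact, where each node lists its parents under `depends_on.nodes`. The sample manifest, the project name `demo`, the model names, and the `upstream` helper are all hypothetical and were not shown in the talk.

```python
from collections import deque

# Hypothetical, heavily trimmed stand-in for a dbt manifest.json artifact.
# Real manifests carry far more fields; only the dependency edges are kept.
SAMPLE_MANIFEST = {
    "nodes": {
        "model.demo.stg_orders": {
            "depends_on": {"nodes": ["source.demo.raw.orders"]}
        },
        "model.demo.stg_customers": {
            "depends_on": {"nodes": ["source.demo.raw.customers"]}
        },
        "model.demo.orders_enriched": {
            "depends_on": {
                "nodes": ["model.demo.stg_orders", "model.demo.stg_customers"]
            }
        },
        "model.demo.revenue_kpis": {
            "depends_on": {"nodes": ["model.demo.orders_enriched"]}
        },
    }
}


def upstream(manifest: dict, unique_id: str) -> set[str]:
    """Return every node that `unique_id` depends on, directly or transitively."""
    nodes = manifest["nodes"]
    seen: set[str] = set()
    queue = deque(nodes.get(unique_id, {}).get("depends_on", {}).get("nodes", []))
    while queue:
        parent = queue.popleft()
        if parent in seen:
            continue
        seen.add(parent)
        # Sources are not listed under "nodes", so the lookup falls back to
        # an empty parent list and the walk stops there.
        queue.extend(nodes.get(parent, {}).get("depends_on", {}).get("nodes", []))
    return seen


if __name__ == "__main__":
    for name in sorted(upstream(SAMPLE_MANIFEST, "model.demo.revenue_kpis")):
        print(name)
```

The same traversal, pointed at a real manifest loaded with `json.load`, is one way lineage and catalog views could be derived from metadata that dbt already emits.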

Challenges with Multiple Data Warehouses

  • Many enterprises use multiple data warehouses for various reasons, such as acquisitions or the need for different tools for specific use cases.
  • Integrating these platforms then requires custom scripts and hand-built pipelines, which forfeits the benefits of governance across the end-to-end data pipeline.

Introducing Cross-Platform dbt Mesh

  • dbt Labs is working on a solution called Cross-Platform dbt Mesh, which lets you connect your dbt DAG across multiple data platforms while preserving governance, speed of development, and the other benefits of using dbt.

Moderna's Journey with dbt and Cross-Platform dbt Mesh

Key Challenges at Moderna

  1. Data accessibility and availability: Ensuring data is accessible and securely shared across business functions.
  2. Data governance: Securely governing data and adhering to regulatory and compliance requirements.
  3. Scalable infrastructure: Enabling various use cases (analytics, data science, etc.) and scaling the data platform.

How dbt Helped Moderna

  1. Data Domain and Ownership: Moderna transitioned from a centralized data team to a data domain-based approach, where data engineers and business stakeholders collaborate on specific business domains.
  2. Data as a Product: Moderna used dbt's data mesh framework to build curated data products from data in the data lake and data warehouse, enabling self-service data and analytics.
  3. Compliance and Governance: Moderna used dbt to enforce data quality, metadata, and access controls, helping streamline governance and compliance.
  4. Pipeline and Data Product Maintenance: dbt helped Moderna maintain data pipelines and data products, providing end-to-end lineage and cataloging.

A Use Case: Supply Chain Management

  • Moderna needed to build a supply chain KPI dashboard, combining data from the data lake (supply chain and shipment data) and the data warehouse (manufacturing data).
  • Without dbt Mesh, this would have required duplicating data and breaking lineage.
  • With dbt Mesh, Moderna was able to create a single project that referenced models from the Athena (data lake) and Redshift (data warehouse) projects, preserving lineage and streamlining the data engineering effort.
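
The shape of this use case can be sketched in Python as a toy model: three projects, where the KPI project references models owned by the other two, flattened into one lineage graph. This is an illustration of the idea only, not how dbt Mesh is implemented. The project names, model names, and the `build_lineage` and `build_order` helpers are hypothetical, and the platform assigned to the KPI project is an assumption because the summary does not say where it runs.

```python
from graphlib import TopologicalSorter

# Hypothetical projects. Each model lists the (project, model) pairs it references.
PROJECTS = {
    "lake_athena": {
        "platform": "athena",
        "models": {"shipments": [], "supply_chain": []},
    },
    "warehouse_redshift": {
        "platform": "redshift",
        "models": {"manufacturing": []},
    },
    "supply_chain_kpis": {
        "platform": "redshift",  # assumption: not stated in the summary
        "models": {
            "kpi_dashboard": [
                ("lake_athena", "shipments"),
                ("lake_athena", "supply_chain"),
                ("warehouse_redshift", "manufacturing"),
            ]
        },
    },
}


def build_lineage(projects: dict) -> dict[str, set[str]]:
    """Flatten per-project models into one graph keyed by 'project.model'."""
    graph: dict[str, set[str]] = {}
    for project, spec in projects.items():
        for model, refs in spec["models"].items():
            parents = set()
            for ref_project, ref_model in refs:
                # Fail loudly on a reference to a model that no project owns.
                if ref_model not in projects.get(ref_project, {}).get("models", {}):
                    raise KeyError(f"unresolved ref: {ref_project}.{ref_model}")
                parents.add(f"{ref_project}.{ref_model}")
            graph[f"{project}.{model}"] = parents
    return graph


def build_order(graph: dict[str, set[str]]) -> list[str]:
    """Return the nodes so that every model appears after all of its parents."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    lineage = build_lineage(PROJECTS)
    for node in build_order(lineage):
        project = node.split(".")[0]
        print(f"{node} ({PROJECTS[project]['platform']})")
```

Because the references stay as edges in a single graph rather than as copied tables, the dashboard model's parents on both platforms remain visible in its lineage.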

Conclusion

  1. A data platform is the foundation for any organization's data success.
  2. Strong data governance and security are key for data projects.
  3. Scalable infrastructure, while keeping costs under control, is crucial for data project success.
