AWS re:Invent 2025 - Warner Music Group: Apache Iceberg on AWS & Snowflake (ANT205)
Leveraging Apache Iceberg to Enable Interoperability Across Data Platforms at Warner Music Group
Overview
This presentation discusses how Warner Music Group (WMG), a major music conglomerate, has used Apache Iceberg to build a unified, interoperable data architecture across their complex, multi-engine, and decentralized data environment. The focus is on how Iceberg lets WMG integrate data and analytics capabilities across Snowflake, Databricks, and legacy systems.
Challenges in WMG's Data Landscape
Highly decentralized and distributed data environment with numerous labels and sub-labels
Lack of standardization leading to a multi-engine, multi-vendor landscape
Hundreds of data pipelines and BI/analytics tools spread across Snowflake and Databricks
Need to share data with third-party integrations and receive data from external sources
Adopting Apache Iceberg as the Unifying Data Layer
Iceberg provides a data lake table format that enables a consistent, platform-agnostic way to access data
Key features leveraged by WMG include partition evolution, time travel, and flexibility to work with data across systems
Iceberg acts as the "connective tissue" between WMG's Snowflake-based data lake and Databricks-based data mesh
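To make the time travel feature more concrete, here is a minimal conceptual sketch in Python. It is not the Iceberg API or WMG's implementation: `Snapshot`, `ToyTable`, `commit`, and `as_of` are hypothetical names invented for illustration. It models only the general idea that each commit produces an immutable snapshot listing the data files visible at that point, so a reader can resolve the table as of an earlier timestamp.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical, simplified stand-in for a table snapshot."""
    snapshot_id: int
    timestamp_ms: int
    data_files: Tuple[str, ...]  # files visible to readers of this snapshot


@dataclass
class ToyTable:
    """Toy table that keeps every snapshot it has ever committed."""
    snapshots: List[Snapshot] = field(default_factory=list)

    def commit(
        self,
        timestamp_ms: int,
        added: Sequence[str],
        removed: Sequence[str] = (),
    ) -> Snapshot:
        # Assumption: commits arrive in increasing timestamp order.
        current = self.snapshots[-1].data_files if self.snapshots else ()
        files = tuple(f for f in current if f not in removed) + tuple(added)
        snap = Snapshot(len(self.snapshots) + 1, timestamp_ms, files)
        self.snapshots.append(snap)
        return snap

    def as_of(self, timestamp_ms: int) -> Optional[Snapshot]:
        # Latest snapshot committed at or before the requested time.
        visible = [s for s in self.snapshots if s.timestamp_ms <= timestamp_ms]
        return visible[-1] if visible else None


table = ToyTable()
table.commit(1_000, added=["a.parquet"])
table.commit(2_000, added=["b.parquet"])
table.commit(3_000, added=["c.parquet"], removed=["a.parquet"])

# Reading "as of" t=2500 still sees the file that was removed later.
print(table.as_of(2_500).data_files)  # ('a.parquet', 'b.parquet')
```

Because old snapshots are never rewritten in this model, any engine that can read the snapshot list resolves the same historical view, which is the property that makes a shared table format useful across platforms.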
Technical Implementation and Integrations
Opus Case Study:
Opus is WMG's most widely used internal tool, with 700+ users accessing data directly from Snowflake
To integrate data from Databricks' data mesh, WMG leveraged Iceberg's federated catalog capabilities to enable seamless data access across platforms
This approach allowed WMG to bring Snowflake to the data in Databricks rather than moving the data; the reported performance impact was 10%
Expanding Iceberg Usage:
WMG is increasingly using Snowflake's native Iceberg support to publish data as Iceberg tables, making it accessible to other engines
This enables efficient data sharing with third-party integrations and other parts of the business without the need for complex ETL processes
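As a rough illustration of publishing data as an Iceberg table from Snowflake, the sketch below builds a `CREATE ICEBERG TABLE` statement for a Snowflake-managed table. The talk summary does not show WMG's actual DDL, so the helper `create_iceberg_table_sql` and the table, column, external volume, and base location names are all hypothetical placeholders. The helper only formats a string and does no identifier escaping, so it should not be given untrusted input.

```python
from typing import Dict


def create_iceberg_table_sql(
    table: str,
    columns: Dict[str, str],
    external_volume: str,
    base_location: str,
) -> str:
    """Format a CREATE ICEBERG TABLE statement for a Snowflake-managed table.

    Hypothetical helper for illustration; every argument is a placeholder.
    """
    cols = ",\n  ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
    return (
        f"CREATE ICEBERG TABLE {table} (\n  {cols}\n)\n"
        "  CATALOG = 'SNOWFLAKE'\n"  # Snowflake acts as the Iceberg catalog
        f"  EXTERNAL_VOLUME = '{external_volume}'\n"  # storage the files are written to
        f"  BASE_LOCATION = '{base_location}';"  # path within the external volume
    )


# Placeholder names only; none of these come from the talk.
sql = create_iceberg_table_sql(
    table="example_streams",
    columns={"track_id": "STRING", "stream_count": "BIGINT"},
    external_volume="example_ext_vol",
    base_location="example/streams",
)
print(sql)
```

Once a table is written this way, its data and metadata files live in the external storage location, so other engines that can reach the same catalog or metadata can read the table without an export step.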
Key Learnings and Outcomes
Learnings:
Snowflake integration required custom configurations, such as optimizing partition file sizes
Databricks' UniForm Iceberg integration had limitations around deletion vectors and multi-table transactions
Outcomes:
70% reduction in EC2 usage for data management tasks
Established a single, consistent data layer and source of truth, improving data trust and quality
Faster time-to-insight by leveraging the unified Iceberg-based data layer
Enabled a future-ready, open ecosystem for leveraging data across AI/ML and other emerging use cases
Conclusion
By adopting Apache Iceberg as the unifying data layer, WMG has been able to overcome the challenges of their complex, decentralized data landscape and achieve a high degree of interoperability across their Snowflake, Databricks, and other data platforms. This has resulted in significant improvements in efficiency, data quality, and the ability to rapidly deliver insights to support their music business.