AWS re:Invent 2025 - From Legacy to Lakehouse: Proven Best Practices for Data Warehouse Migration
Migrating Legacy Data Warehouses to a Lakehouse Architecture
Challenges with Legacy Data Warehouses
Legacy on-premises data warehouses can be cost-prohibitive to run and scale, and can limit future business opportunities
Key challenges include scale, cost, openness, AI/ML readiness, and complex ETL/data preparation
Benefits of a Lakehouse Architecture
Enables openness, portability, and integration with other systems and workloads
Supports a wide range of analytics, BI, ETL, and AI/ML use cases on a single platform
Provides end-to-end data governance and security
Four-Step Approach to Successful Migration
Engage a migration partner to scope, plan, and execute the project
Define the target payback period and migration strategy (lift-and-shift vs. modernization)
Assess the current environment and convert/migrate the data
Repoint applications to the new data warehouse and continuously optimize
The Role of Migration Tooling
Migration tooling typically accounts for 20% of the overall migration effort
Key capabilities include:
Assessing and profiling the current environment
Translating SQL queries from legacy dialects into ANSI-standard SQL
Automating data migration and validation
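As an illustration of the validation capability listed above, here is a minimal sketch that compares a table between a source and a target system by row count and content digest. It is not taken from the talk or from any specific migration tool: the function names (table_fingerprint, validate_migration, make_demo_db) and the orders table are hypothetical, and in-memory SQLite databases stand in for the legacy warehouse and the new platform.

```python
import hashlib
import sqlite3


def table_fingerprint(conn, table, key_column):
    """Return (row_count, sha256 digest) for a table, reading rows in key order."""
    # Identifiers cannot be bound as query parameters, so this sketch
    # assumes the table and column names come from a trusted source.
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY {key_column}")
    digest = hashlib.sha256()
    row_count = 0
    for row in cursor:
        digest.update(repr(row).encode("utf-8"))
        row_count += 1
    return row_count, digest.hexdigest()


def validate_migration(source_conn, target_conn, table, key_column):
    """Compare row counts and content digests between source and target."""
    src_count, src_hash = table_fingerprint(source_conn, table, key_column)
    tgt_count, tgt_hash = table_fingerprint(target_conn, table, key_column)
    return {
        "table": table,
        "row_count_match": src_count == tgt_count,
        "content_match": src_hash == tgt_hash,
    }


def make_demo_db(rows):
    """Build an in-memory database standing in for a warehouse (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


legacy = make_demo_db([(1, 10.0), (2, 25.5), (3, 7.25)])
migrated = make_demo_db([(1, 10.0), (2, 25.5), (3, 7.25)])
drifted = make_demo_db([(1, 10.0), (2, 99.0), (3, 7.25)])

print(validate_migration(legacy, migrated, "orders", "id"))
print(validate_migration(legacy, drifted, "orders", "id"))
```

The first comparison reports a match on both checks. The second reports matching row counts but differing content, which is the kind of silent drift a count-only check would miss. A real migration would likely push the hashing down into each engine rather than pulling every row to the client.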
Databricks SQL and the Data Intelligence Platform
Databricks SQL is an open, intelligent data warehouse that sits within the broader Data Intelligence Platform
Provides tight integration with BI, ETL, AI/ML, governance, and other capabilities
Used by 60% of the Fortune 500 for core data workloads
Four Pillars of Modernization
Cost Optimization: Maximize TCO savings by tracking granular cost metrics
Openness and No Lock-in: Ensure data portability, federation, and multi-cloud capabilities
Governance: Implement end-to-end data governance and security
Speed and Predictability: Migrate quickly and with minimal risk
Customer Migration Example
57% TCO reduction over 5 years
12-18 month payback period
50% faster data delivery
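To make the TCO and payback metrics above concrete, here is a small arithmetic sketch. The cost model is a simplifying assumption (flat annual run costs plus a one-time migration cost, with no parallel-run overhead), and every figure in it is hypothetical rather than taken from the customer example.

```python
def tco_reduction_pct(legacy_annual_cost, new_annual_cost, migration_cost, years):
    """Percentage reduction in total cost of ownership over the given horizon."""
    legacy_tco = legacy_annual_cost * years
    new_tco = new_annual_cost * years + migration_cost
    return 100.0 * (legacy_tco - new_tco) / legacy_tco


def payback_months(legacy_annual_cost, new_annual_cost, migration_cost):
    """Months of run-rate savings needed to recover the one-time migration cost."""
    monthly_savings = (legacy_annual_cost - new_annual_cost) / 12.0
    if monthly_savings <= 0:
        raise ValueError("the new platform must cost less per year to pay back")
    return migration_cost / monthly_savings


# Hypothetical figures chosen for illustration only.
legacy_annual = 1_000_000
new_annual = 400_000
migration = 750_000

print(tco_reduction_pct(legacy_annual, new_annual, migration, years=5))  # 45.0
print(payback_months(legacy_annual, new_annual, migration))  # 15.0
```

With these inputs the five-year TCO falls by 45% and the migration pays for itself in 15 months. Tracking the same two numbers against actual spend during the project shows whether the migration is still on course for its target payback period.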
Data Mesh Architecture
Establishes a "hub and spoke" model with centralized governance and domain-specific data products
Empowers business domains to manage their own data while maintaining overall data integrity
Enables self-service, automation, and frictionless user experience
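The hub-and-spoke idea can be sketched as a central catalog that enforces a few shared rules while each domain registers and owns its own data products. This is an illustrative model only: the class names (DataProduct, GovernanceHub), the classification labels, and the rules themselves are assumptions, not a description of any product's governance API.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A domain-owned data product (the spoke side of the model)."""
    name: str
    domain: str
    owner: str
    classification: str  # hypothetical labels: "public", "internal", "restricted"


@dataclass
class GovernanceHub:
    """Central catalog that applies the same rules to every domain (the hub)."""
    allowed_classifications: frozenset = frozenset({"public", "internal", "restricted"})
    catalog: dict = field(default_factory=dict)

    def register(self, product):
        """Validate a product against the central rules, then add it to the catalog."""
        if not product.owner:
            raise ValueError("every data product needs a named owner")
        if product.classification not in self.allowed_classifications:
            raise ValueError(f"unknown classification: {product.classification}")
        key = f"{product.domain}.{product.name}"
        if key in self.catalog:
            raise ValueError(f"already registered: {key}")
        self.catalog[key] = product
        return key

    def products_for(self, domain):
        """List the catalog keys owned by one domain, in sorted order."""
        return sorted(k for k, p in self.catalog.items() if p.domain == domain)


hub = GovernanceHub()
hub.register(DataProduct("daily_orders", "sales", "sales-data-team", "internal"))
hub.register(DataProduct("campaign_spend", "marketing", "mkt-analytics", "restricted"))
print(hub.products_for("sales"))
```

Domains publish without going through a central data team, but a product with no owner or an unrecognized classification is rejected at registration time, which is where the centralized governance shows up.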
Key Lessons Learned
Dedicate a focused team for the migration project
Run a hybrid phase with parallel legacy and new systems
Strengthen data governance as you transition to a data mesh model
Start small with high-value domains to prove the impact
Manage the platform as a product, not just a tool
Conclusion
By migrating from legacy data warehouses to a modern lakehouse architecture, organizations can unlock significant cost savings, improve data openness and governance, and enable advanced analytics and AI/ML capabilities. The key is to carefully plan the migration, leverage the right tooling, and establish a data mesh framework to empower business domains while maintaining centralized control.