AWS re:Invent 2025 - Scaling Amazon Redshift with a multi-warehouse architecture (ANT318)

Scaling Amazon Redshift with a Multi-Warehouse Architecture

Overview of Multi-Warehouse Architectures

Addresses challenges of workload interference and resource contention in monolithic data warehouse architectures
Introduces two key design patterns:
1. Hub and Spoke: Separate compute clusters for different workloads (e.g. streaming, batch, analytics, data science) with a centralized data store
2. Data Mesh: Separate compute clusters and data ownership for different business units/teams, with controlled data sharing
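
The hub-and-spoke pattern can be pictured as a routing table from workload type to a dedicated warehouse. The following is a minimal sketch under stated assumptions, not how the talk or Redshift implements routing: the warehouse names, the `Warehouse` class, the `route` function, and the choice of provisioned versus serverless for each workload are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Warehouse:
    name: str      # hypothetical warehouse label
    compute: str   # "provisioned" or "serverless"


# Hypothetical mapping: each workload type gets its own spoke warehouse,
# so a heavy batch job cannot starve interactive analytics of compute.
SPOKES = {
    "streaming": Warehouse("ingest-wh", "provisioned"),
    "batch": Warehouse("etl-wh", "provisioned"),
    "analytics": Warehouse("bi-wh", "serverless"),
    "data_science": Warehouse("ds-wh", "serverless"),
}


def route(workload: str) -> Warehouse:
    """Return the warehouse assigned to the given workload type."""
    try:
        return SPOKES[workload]
    except KeyError:
        raise ValueError(f"no warehouse configured for workload {workload!r}") from None
```

In this sketch, isolation comes from the mapping itself: no two workload types share a warehouse, while all of them would read from the same centralized data store.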

Key Features of Multi-Warehouse Architectures

Redshift Managed Storage and Compute

Redshift Managed Storage provides highly optimized columnar storage for analytics
Hybrid compute model: provisioned clusters and Redshift Serverless workgroups can be mixed and matched based on workload needs

Federated Permissions Management

Centralized management of fine-grained access control policies across multiple Redshift clusters
Policies tied to user identity and data sovereignty requirements

Integration with Data Lake and Ecosystem

Ability to query data in Redshift and open table formats like Apache Iceberg
Reported 2x performance improvement for Iceberg queries on Redshift Serverless
Iceberg write support for append-only workloads
Integration with SageMaker Unified Studio for end-to-end data and AI workflows

AI-Powered Use Cases

Natural language querying of Redshift data using Amazon Bedrock
Embedding Redshift data as knowledge bases for generative AI applications
Integration with the Model Context Protocol (MCP) for AI orchestration

Vanguard's Journey with Multi-Warehouse Architectures

Started with a centralized data warehouse on Redshift, unlocking BI and analytics use cases
Faced challenges with resource contention, workload management complexity, and scaling as data and use cases grew
Transitioned to a multi-warehouse "hub and spoke" architecture:
- Separate Redshift clusters for ETL, analytics, and data science workloads
- Improved SLAs, analyst experience, and workload isolation
Moving towards a "data mesh" architecture:
- Separate data ownership and compute for different business domains
- Leveraging Iceberg tables and Redshift Serverless for increased agility
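
The summary does not say which mechanism carries the controlled sharing between domains. Assuming it is Redshift data sharing, the sketch below generates the SQL a producer domain and a consumer domain would each run. The function name, share name, schema, and namespace IDs are hypothetical placeholders; the statements follow the documented `CREATE DATASHARE`, `ALTER DATASHARE`, `GRANT USAGE ON DATASHARE`, and `CREATE DATABASE ... FROM DATASHARE` forms.

```python
import re

_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def _ident(name: str) -> str:
    """Accept only simple lowercase identifiers to keep generated SQL safe."""
    if not _IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def _namespace(ns: str) -> str:
    """Reject namespace IDs that could break out of the quoted literal."""
    if "'" in ns or not ns:
        raise ValueError(f"unsafe namespace id: {ns!r}")
    return ns


def datashare_statements(share, schema, producer_ns, consumer_ns, consumer_db):
    """Build the SQL for sharing one schema from a producer to a consumer."""
    share, schema, consumer_db = _ident(share), _ident(schema), _ident(consumer_db)
    producer = [
        f"CREATE DATASHARE {share};",
        f"ALTER DATASHARE {share} ADD SCHEMA {schema};",
        f"ALTER DATASHARE {share} ADD ALL TABLES IN SCHEMA {schema};",
        f"GRANT USAGE ON DATASHARE {share} TO NAMESPACE '{_namespace(consumer_ns)}';",
    ]
    consumer = [
        f"CREATE DATABASE {consumer_db} FROM DATASHARE {share} "
        f"OF NAMESPACE '{_namespace(producer_ns)}';",
    ]
    return {"producer": producer, "consumer": consumer}
```

The producer keeps ownership of the tables, and the consumer only gets a read-only database backed by the share, which matches the "separate ownership, controlled sharing" idea described above.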

Key Lessons and Best Practices

Start simple and gradually evolve the architecture as needs grow
Collaborate with AWS solution architects to identify and adopt new features
Track key metrics like active users, storage, costs, and query performance
Embrace a flexible, multi-layered architecture to meet diverse and evolving business requirements

Technical Details and Metrics

Vanguard's data landscape:
- 20TB in Redshift Managed Storage
- 150TB in S3 data lake
- 600 tables, 400 views, 100 active users
- 500,000+ queries per month, powered by thousands of batch jobs
Redshift Serverless workgroups used for workload isolation and improved performance
Apache Iceberg used as the open table format for the data lake
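
To put the figures above in perspective, here is a small back-of-the-envelope calculation. It is only a sketch: it treats "500,000+" as exactly 500,000, assumes a 30-day month, and assumes queries are spread evenly over time, none of which the summary states.

```python
QUERIES_PER_MONTH = 500_000   # lower bound from the summary ("500,000+")
RMS_TB = 20                   # Redshift Managed Storage
LAKE_TB = 150                 # S3 data lake
DAYS_PER_MONTH = 30           # assumption, not from the source

queries_per_day = QUERIES_PER_MONTH / DAYS_PER_MONTH
queries_per_hour = queries_per_day / 24
lake_share = LAKE_TB / (RMS_TB + LAKE_TB)

print(f"{queries_per_day:,.0f} queries/day")    # about 16,667
print(f"{queries_per_hour:,.0f} queries/hour")  # about 694
print(f"{lake_share:.0%} of data in the lake")  # about 88%
```

Under these assumptions, roughly seven out of every eight terabytes sit in the data lake rather than in Redshift Managed Storage, which is consistent with the emphasis on Iceberg as the open table format.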

Business Impact

Enabled new use cases like comparative product analysis, leading to new product offerings
Improved analyst experience and self-service analytics capabilities
Increased agility and reduced coordination overhead through the data mesh architecture
Ensured business-critical workloads (ETL, reporting) are isolated from ad-hoc queries and AI use cases

Example Use Cases

Ingesting real-time sales data from an Oracle database using a zero-ETL integration
Combining data from the data warehouse and data lake (Iceberg) for reporting
Exposing Redshift data as a knowledge base for generative AI applications using Amazon Bedrock
