AWS re:Invent 2025 - Seamless data sharing in Amazon Redshift (ANT348)
Seamless Data Sharing in Amazon Redshift: Improving Agility and Governance
Introduction to Amazon Redshift Data Sharing
Amazon Redshift data sharing is a mature capability that allows organizations to share live data securely and easily across Redshift warehouses.
It supports sharing within the same AWS account, across accounts, and across AWS Regions, for both read and write workloads.
Data sharing eliminates the need for manual data copying and movement, complex ETL pipelines, and the resulting data staleness and governance challenges.
Technical Overview of Redshift Data Sharing Architecture
The data sharing architecture builds on Redshift's separation of storage and compute.
Multiple producer warehouses can share data through Redshift-managed storage to multiple consumer warehouses.
Consumers can also share their data with other warehouses, enabling a flexible, multi-directional data sharing topology.
Users and applications can access the most up-to-date and consistent data without the need for manual data copying.
Data Share Management Options
Redshift-Managed Data Shares: Data shares are created and permissions managed within the producer Redshift warehouse, then shared directly with consumer warehouses.
Lake Formation-Managed Data Shares: Data shares are created in the producer warehouse but permissions are centrally managed through AWS Lake Formation.
AWS Data Exchange for Amazon Redshift: Customers can share their data as a product through the AWS Data Exchange, allowing subscribers to access the data.
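As a rough illustration of the Redshift-managed option, the sketch below assembles the SQL a producer and a consumer would typically run. The statements follow the documented `CREATE DATASHARE`, `ALTER DATASHARE`, `GRANT USAGE ON DATASHARE`, and `CREATE DATABASE ... FROM DATASHARE` syntax. The helper names, the share, schema, and table names, and the namespace GUIDs are hypothetical and were not part of the talk.

```python
import re
import uuid


def _ident(name: str) -> str:
    """Accept only plain SQL identifiers so the generated SQL stays safe."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def _namespace(guid: str) -> str:
    """Redshift namespaces are GUIDs; uuid.UUID raises ValueError otherwise."""
    return str(uuid.UUID(guid))


def producer_statements(share, schema, tables, consumer_namespace):
    """SQL run on the producer: create the share, add objects, grant usage."""
    share, schema = _ident(share), _ident(schema)
    stmts = [
        f"CREATE DATASHARE {share};",
        f"ALTER DATASHARE {share} ADD SCHEMA {schema};",
    ]
    for table in tables:
        stmts.append(
            f"ALTER DATASHARE {share} ADD TABLE {schema}.{_ident(table)};"
        )
    stmts.append(
        f"GRANT USAGE ON DATASHARE {share} "
        f"TO NAMESPACE '{_namespace(consumer_namespace)}';"
    )
    return stmts


def consumer_statement(share, local_db, producer_namespace):
    """SQL run on the consumer: expose the share as a local database."""
    return (
        f"CREATE DATABASE {_ident(local_db)} FROM DATASHARE {_ident(share)} "
        f"OF NAMESPACE '{_namespace(producer_namespace)}';"
    )


if __name__ == "__main__":
    # Hypothetical names and GUIDs, for illustration only.
    for stmt in producer_statements(
        "sales_share", "public", ["orders", "customers"],
        "13b8833d-17c6-4f16-8fe4-1a018f5ed00d",
    ):
        print(stmt)
    print(consumer_statement(
        "sales_share", "sales_db", "86b5169f-01dc-4a6f-9fbb-e2e24359e9a8"
    ))
```

The generated statements can be run through any SQL client connected to the respective warehouse. Cross-account sharing uses `TO ACCOUNT '<account-id>'` in place of `TO NAMESPACE` and additionally requires the share to be authorized and associated before the consumer can use it.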
Common Use Cases for Redshift Data Sharing
Workload Isolation: Splitting a monolithic Redshift warehouse into specialized, purpose-built warehouses for different workloads (ETL, BI, data science, etc.), connected via data sharing.
Example: Peloton implemented this to save $300,000 annually.
Cross-Group Collaboration: Enabling seamless data access and sharing across business teams (finance, sales, marketing, R&D, etc.) for broader analytics and data science.
Example: Fannie Mae built a central data marketplace using Redshift data sharing.
Delivering Data as a Service: Securely sharing live data with internal and external parties as a data product, with usage monitoring and control.
Sharing Data between Environments: Enabling agile data access between development, test, and production Redshift environments without the need for data copying.
Enhancing Governance with AWS Lake Formation
For complex data sharing topologies with multiple producers and consumers, Lake Formation can simplify permission management by centralizing access controls.
Lake Formation allows centralized definition and enforcement of data sharing policies, eliminating the need for manual scripting and complex coordination.
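For the Lake Formation-managed path, the producer-side grant targets the AWS Glue Data Catalog of an account rather than a specific consumer namespace. The sketch below builds that one statement using the documented `VIA DATA CATALOG` clause. The function name, share name, and account ID are hypothetical.

```python
import re


def lake_formation_grant(share: str, account_id: str) -> str:
    """Build the producer-side grant that hands a datashare to Lake Formation.

    The statement shares with an account's Data Catalog; fine-grained
    permissions are then defined in Lake Formation, not in Redshift.
    """
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", share):
        raise ValueError(f"unsafe datashare name: {share!r}")
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("AWS account IDs are 12 digits")
    return (
        f"GRANT USAGE ON DATASHARE {share} "
        f"TO ACCOUNT '{account_id}' VIA DATA CATALOG;"
    )


if __name__ == "__main__":
    # Hypothetical share name and account ID, for illustration only.
    print(lake_formation_grant("sales_share", "123456789012"))
```

After this grant, a Lake Formation administrator typically still has to accept and register the share and then assign permissions to consumers. Those steps happen outside Redshift SQL and are not shown here.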
Best Practices and Considerations
Pricing: Producer warehouses are charged for data storage, while consumers pay for compute usage to access shared data.
Cross-region data sharing incurs additional data transfer fees.
Query performance depends on the consumer cluster's compute capacity, not the producer's.
Redshift supports concurrency scaling to handle unpredictable workloads.
Data is encrypted in transit and at rest, with options for separate KMS keys for producers and consumers.
Redshift system tables can be used to audit data share usage and changes.
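A minimal sketch of such an audit, assuming a provisioned cluster where the `SVV_DATASHARE*` and `SVL_DATASHARE*` system views are available. The view names come from the Redshift documentation. The `audit_query` helper and its category keys are hypothetical, and the queries select all columns to avoid assuming specific column names.

```python
# Map an audit question to the Redshift system view that answers it.
AUDIT_VIEWS = {
    "shares": "svv_datashares",                      # shares defined or received
    "objects": "svv_datashare_objects",              # objects in each share
    "consumers": "svv_datashare_consumers",          # who a share is granted to
    "changes": "svl_datashare_change_log",           # share changes over time
    "producer_usage": "svl_datashare_usage_producer",
    "consumer_usage": "svl_datashare_usage_consumer",
}


def audit_query(kind: str, limit: int = 100) -> str:
    """Return a simple SELECT against the system view for the given category."""
    if kind not in AUDIT_VIEWS:
        raise KeyError(f"unknown audit category: {kind!r}")
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    return f"SELECT * FROM {AUDIT_VIEWS[kind]} LIMIT {limit};"


if __name__ == "__main__":
    for kind in AUDIT_VIEWS:
        print(audit_query(kind, limit=20))
```

The change log and usage views are the ones most relevant to auditing who altered a share and who queried it. The other views describe the current state of shares, their objects, and their consumers.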
Getting Started with Redshift Data Sharing
The presentation provides a QR code linking to resources, including blog posts and best practice documents, to help customers get started with Redshift data sharing.