AWS re:Invent 2025 - Amazon S3 Tables architecture, use cases, and best practices (STG334)

Summary of AWS re:Invent 2025 - Amazon S3 Tables Architecture, Use Cases, and Best Practices

Overview of Amazon S3 Tables

Amazon S3 Tables is a fully managed service that allows customers to create and manage Apache Iceberg tables directly in Amazon S3

Key benefits include optimized performance and scale, simplified security controls, and automatic table maintenance

S3 Tables is tightly integrated with other AWS services like AWS Glue Data Catalog and provides open access via the Iceberg REST Catalog interface
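
As a rough illustration of what open access through the Iceberg REST Catalog interface can look like from a client, the sketch below assembles the connection properties an Iceberg REST client would typically need. The endpoint pattern, the SigV4 property names, and the example ARN are assumptions that do not come from the session; verify them against the current S3 Tables and client documentation before use.

```python
def s3tables_rest_catalog_properties(region: str, table_bucket_arn: str) -> dict:
    """Build Iceberg REST catalog properties for an S3 table bucket.

    Assumption: the endpoint pattern and SigV4 property names below follow
    common Iceberg REST client conventions; they were not shown in the session.
    """
    return {
        "type": "rest",
        "uri": f"https://s3tables.{region}.amazonaws.com/iceberg",
        "warehouse": table_bucket_arn,
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "s3tables",
        "rest.signing-region": region,
    }


# Hypothetical account ID and bucket name, for illustration only.
props = s3tables_rest_catalog_properties(
    "us-east-1",
    "arn:aws:s3tables:us-east-1:111122223333:bucket/example-table-bucket",
)
for key, value in props.items():
    print(f"{key} = {value}")
```

A dictionary like this could then be handed to an Iceberg client library that accepts REST catalog properties.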

Recent Launches and Updates

Over the past year, AWS has shipped numerous features to make S3 Tables more flexible and optimized for data lake environments:

Added sort and z-order compaction strategies for better query performance and cost efficiency
Expanded to 32 AWS regions and scaled up to 100,000 tables per region
Added table-level encryption using KMS and resource tags for attribute-based access control
Enabled direct access to Athena and SageMaker Unified Studio from the S3 console
Integrated with partner tools using the Iceberg REST Catalog interface

New Capabilities

Iceberg v3 Support:

Adds support for deletion vectors and row lineage, enabling more efficient data modifications and change tracking
Available through services like SageMaker Unified Studio's notebook interface

Intelligent Tiering for S3 Tables:

Automatically transitions table data across S3's frequent, infrequent, and archive access tiers based on access patterns
Reduces storage costs by up to 80% without affecting performance
Compaction is now tier-aware, focusing on optimizing the most actively queried data

S3 Tables Replication:

Enables replicating Iceberg tables across AWS regions for improved performance, compliance, and data protection
Automatically mirrors table and namespace resources, replicates data and metadata, and maintains snapshot history
Provides built-in audit trails, real-time monitoring, and the ability to configure replica settings independently

Customer Examples

Zeta Global reduced data freshness latency by 80% and shortened time to insight from 15 minutes to a few minutes by using S3 Tables for its petabyte-scale data lake.

Indeed is migrating their 85PB data lake to S3 Tables, streamlining their data infrastructure and reducing costs. The migration has unlocked significant business value:

75% faster reporting for the Heron Insights team
65% cost reduction for the Smart Sourcing team
88% reduction in complexity for the Indeed Interviews team
98% improvement in SLOs for the Partner Analytics team

Best Practices and Recommendations

Iceberg Partitioning:

Choose partitioning schemes (time-based or hash-based) that align with your primary query patterns for optimal performance
Iceberg allows changing partitioning schemes over time as data and requirements evolve
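
To make the partitioning advice concrete, here is a minimal sketch of why a scheme that matches the query pattern helps: rows are grouped by a time-based value (whole days since the epoch) so a date-range query touches only the matching groups. The order records are hypothetical, and CRC32 is only a stand-in for the hash-based case (Iceberg's own bucket transform uses a Murmur3 hash).

```python
import zlib
from collections import defaultdict
from datetime import date, datetime, timezone

EPOCH = date(1970, 1, 1)


def day_partition(ts: datetime) -> int:
    """Time-based partition value: whole days since 1970-01-01 in UTC."""
    return (ts.astimezone(timezone.utc).date() - EPOCH).days


def bucket_partition(key: str, num_buckets: int) -> int:
    """Hash-based partition value. CRC32 is a stand-in, not Iceberg's Murmur3."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets


# Hypothetical order events: (order_id, customer_id, event time).
orders = [
    ("o-1", "cust-a", datetime(2025, 12, 1, 9, 30, tzinfo=timezone.utc)),
    ("o-2", "cust-b", datetime(2025, 12, 1, 17, 5, tzinfo=timezone.utc)),
    ("o-3", "cust-a", datetime(2025, 12, 2, 8, 0, tzinfo=timezone.utc)),
    ("o-4", "cust-c", datetime(2025, 12, 3, 12, 45, tzinfo=timezone.utc)),
]

# Group order ids by their day partition, as a table partitioned by day would.
by_day = defaultdict(list)
for order_id, customer_id, ts in orders:
    by_day[day_partition(ts)].append(order_id)


def scan_days(start: date, end: date) -> list:
    """Read only the partitions that overlap [start, end]; skip all others."""
    lo, hi = (start - EPOCH).days, (end - EPOCH).days
    return [oid for day, ids in sorted(by_day.items()) if lo <= day <= hi for oid in ids]


print(scan_days(date(2025, 12, 1), date(2025, 12, 1)))  # reads one of three partitions
print(bucket_partition("cust-a", 8))  # same customer always maps to the same bucket
```

A query that filters by customer instead of by date would gain nothing from the day grouping, which is the case where a hash-based scheme on the customer key is the better fit.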

Compaction:

Keep the default "auto" compaction mode, which uses sort compaction if the table defines a sort order and bin-pack compaction otherwise
Consider z-order compaction if your queries filter on multiple columns
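
The compaction guidance above can be read as a small decision rule. The sketch below encodes it under stated assumptions: the function names and the strategy strings ("sort", "binpack", "z-order", "auto") are illustrative labels chosen here, not a confirmed API, and the multi-column heuristic is a simplification of the recommendation.

```python
from typing import Optional, Sequence


def resolve_auto_strategy(sort_order: Optional[Sequence[str]]) -> str:
    """Mirror the 'auto' rule: sort when a sort order is defined, else bin-pack."""
    return "sort" if sort_order else "binpack"


def suggest_strategy(filter_columns: Sequence[str]) -> str:
    """Suggest z-order when queries filter on more than one distinct column.

    Otherwise keep the default 'auto' mode, as the session recommends.
    """
    if len(set(filter_columns)) > 1:
        return "z-order"
    return "auto"


# Hypothetical column names, for illustration only.
print(resolve_auto_strategy(["order_date"]))           # sort order defined
print(resolve_auto_strategy(None))                     # no sort order
print(suggest_strategy(["customer_id", "order_date"]))  # multi-column filters
print(suggest_strategy(["order_date"]))                # single-column filter
```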

Snapshot Management:

For batch ETL workloads, keep the default 3-day maximum snapshot age
For streaming workloads, reduce the maximum snapshot age to 24 hours or less to avoid performance degradation from large metadata files
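
A quick back-of-the-envelope calculation shows why the shorter window matters for streaming: the number of snapshots a table retains grows with commit frequency times maximum snapshot age. The commit intervals below are hypothetical; the 72-hour and 24-hour values are the ones quoted above.

```python
def retained_snapshots(commit_interval_minutes: int, max_snapshot_age_hours: int) -> int:
    """Approximate snapshots kept in metadata: one per commit within the age window."""
    return (max_snapshot_age_hours * 60) // commit_interval_minutes


# Hypothetical commit cadences: hourly batch jobs vs. a stream committing every minute.
batch_default = retained_snapshots(commit_interval_minutes=60, max_snapshot_age_hours=72)
stream_default = retained_snapshots(commit_interval_minutes=1, max_snapshot_age_hours=72)
stream_tuned = retained_snapshots(commit_interval_minutes=1, max_snapshot_age_hours=24)

print(f"batch, 3-day window:      {batch_default} snapshots")
print(f"streaming, 3-day window:  {stream_default} snapshots")
print(f"streaming, 24-hour window: {stream_tuned} snapshots")
```

Under these assumed cadences, the streaming table at the default window tracks 60 times as many snapshots as the batch table, and reducing the window to 24 hours cuts that count to a third.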

Unreferenced File Removal:

Generally, keep the default 3-day setting for unreferenced file removal

Developing Applications with S3 Tables

The session demonstrated a web application built with React, Amazon Bedrock, and the DuckDB WebAssembly client that supports natural-language querying of S3 Tables

The application runs complex analytical queries on customer and order data without requiring users to write SQL

The demo highlights how easily serverless, browser-based applications can build on the Iceberg compatibility of S3 Tables

Key Takeaways

S3 Tables simplifies the management of petabyte-scale data lakes by providing a fully managed Iceberg storage service

Recent launches like Iceberg v3 support, Intelligent Tiering, and S3 Tables Replication unlock significant performance, cost, and operational benefits

Customers like Zeta Global and Indeed have seen transformative business impact by migrating to S3 Tables

Best practices around partitioning, compaction, snapshot management, and file removal can help optimize S3 Tables deployments

The ability to build natural language-driven, serverless applications on top of S3 Tables showcases the platform's developer-friendly capabilities

Additional Resources

S3 Tables tutorial: [link]

S3 Tables workshop: [link]

SageMaker Unified Studio integration: [link]
