AWS re:Invent 2025 - Amazon S3 performance: Architecture, design, and optimization (STG335)
Optimizing Amazon S3 Performance: Architecture, Design, and Strategies
S3 Architecture and Scale
S3 is composed of three main components:
Front-end services that handle requests and orchestrate processing
Index that maps object metadata to storage locations
Storage subsystem that manages disks and data placement
S3 currently stores over 500 trillion objects, equating to hundreds of exabytes of storage
S3 serves over 200 million requests per second, with customers running over 1 million data lakes on AWS
Driving High Throughput on S3
Leverage S3's scale by parallelizing connections across multiple IP addresses
Use multipart uploads to split large objects into smaller parts and upload them in parallel
Improves throughput and shortens recovery when an individual connection fails, since only the affected part needs to be retried
Lets an upload start before all of the data is in memory
Use byte-range GETs to download large objects in parallel across multiple connections
In the talk's example, this reduces download time from 5 seconds to 1 second
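As a rough illustration of the parallel-transfer idea above, the following Python sketch splits an object into byte ranges and fetches them concurrently. It is not taken from the talk: `plan_ranges`, `range_header`, and `parallel_get` are hypothetical helper names, the 8 MiB default part size is an assumption, and `fetch_range` stands in for whatever client call issues a ranged GET. The same range plan could drive a multipart upload, where S3 requires every part except the last to be at least 5 MiB and allows at most 10,000 parts.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

MIB = 1024 * 1024


def plan_ranges(object_size: int, part_size: int = 8 * MIB) -> List[Tuple[int, int]]:
    """Split [0, object_size) into inclusive (start, end) byte ranges."""
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    return [
        (start, min(start + part_size, object_size) - 1)
        for start in range(0, object_size, part_size)
    ]


def range_header(start: int, end: int) -> str:
    """Format an inclusive byte range as an HTTP Range header value."""
    return f"bytes={start}-{end}"


def parallel_get(
    fetch_range: Callable[[int, int], bytes],
    object_size: int,
    part_size: int = 8 * MIB,
    max_workers: int = 8,
) -> bytes:
    """Fetch every range concurrently and reassemble the parts in order.

    fetch_range is a caller-supplied (hypothetical) function that returns
    the bytes for one inclusive range, e.g. by sending a ranged GET.
    """
    ranges = plan_ranges(object_size, part_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so parts stay ordered
        # even when the underlying requests finish out of order.
        parts = pool.map(lambda r: fetch_range(*r), ranges)
        return b"".join(parts)
```

With a real client, `fetch_range` would send `range_header(start, end)` as the `Range` header of a GET request and retry only the range that failed. For local experimentation, a stand-in such as `lambda s, e: blob[s:e + 1]` over an in-memory `bytes` object is enough to check that reassembly is correct.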
Architectural Patterns for High Performance
Use the AWS Common Runtime (CRT) client library to automatically handle connection management, parallelization, retries, and more
Provides a "target throughput" configuration to automatically scale connections
Available in the AWS SDKs for Java, C++, and Python, and in the AWS CLI
Leverage S3's prefix-based partitioning to scale throughput
Each prefix can handle up to 3,500 PUT requests or 5,500 GET requests per second
Strategically structure prefixes to avoid "sharp edges" when data patterns change over time
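The per-prefix rates above lend themselves to a quick capacity estimate. The sketch below is a planning aid only, under stated assumptions: `prefixes_needed` and `sharded_key` are hypothetical names, and the hash-derived shard prefix is one possible key layout, not a scheme described in the talk. Hashing spreads load evenly but gives up the ability to list keys in their logical order under a single prefix.

```python
import hashlib
import math

PUTS_PER_PREFIX = 3_500  # per-prefix write rate from the text above
GETS_PER_PREFIX = 5_500  # per-prefix read rate from the text above


def prefixes_needed(target_puts: int, target_gets: int) -> int:
    """Smallest prefix count whose combined rates cover both targets."""
    return max(
        1,
        math.ceil(target_puts / PUTS_PER_PREFIX),
        math.ceil(target_gets / GETS_PER_PREFIX),
    )


def sharded_key(logical_key: str, shard_count: int) -> str:
    """Prepend a stable, hash-derived shard prefix to a key.

    The "<4 hex digits>/<key>" layout is an illustrative assumption.
    """
    if shard_count <= 0:
        raise ValueError("shard_count must be > 0")
    digest = hashlib.sha256(logical_key.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{shard:04x}/{logical_key}"
```

For instance, a workload targeting 10,000 writes and 40,000 reads per second gives `prefixes_needed(10_000, 40_000) == 8`, because the read target is the tighter constraint. Because the shard is derived from the key itself, the same logical key always maps to the same prefix, so readers can recompute the location without a lookup table.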
S3 Express One Zone
S3 Express One Zone is a high-performance storage class offering:
Single-digit millisecond access times
Up to 2 million requests per second per directory bucket
Ability to append data to existing objects
Constant-time object renaming
Key use cases:
Machine learning training
Interactive querying and analytics
Log and media streaming
Model loading for inference pipelines
Architectural considerations:
Single availability zone storage
Directory buckets with pre-scaled throughput
Session-based authentication for low-latency access
Key Takeaways
Leverage S3's scale and parallelism to achieve high throughput and low latency
Use the AWS CRT client library to simplify performance optimization
Structure prefix hierarchies strategically to avoid throughput bottlenecks
Consider S3 Express One Zone for latency-sensitive, bursty workloads
Choose storage classes and architectural patterns based on specific performance requirements