AWS re:Invent 2025 - Amazon S3 performance: Architecture, design, and optimization (STG335)

Optimizing Amazon S3 Performance: Architecture, Design, and Strategies

S3 Architecture and Scale

  • S3 is composed of three main components:
    • Front-end services that handle requests and orchestrate processing
    • Index that maps object metadata to storage locations
    • Storage subsystem that manages disks and data placement
  • S3 currently stores over 500 trillion objects, equating to hundreds of exabytes of storage
  • S3 serves over 200 million requests per second, with customers running over 1 million data lakes on AWS

Driving High Throughput on S3

  • Leverage S3's scale by parallelizing connections across multiple IP addresses
  • Use multipart uploads to split large objects into smaller parts and upload in parallel
    • Improves throughput and recovery time if individual connections fail
    • Allows starting uploads before having all data in memory
  • Use byte-range GETs to download large objects in parallel across multiple connections
    • In the talk's example, this cuts download time from 5 seconds to 1 second
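The parallel range-GET pattern above can be sketched as follows. This is a minimal illustration, not the talk's implementation: `split_ranges`, `parallel_download`, and `s3_range_fetcher` are hypothetical helper names, and the part size and worker count are arbitrary assumptions. The boto3 call used is `get_object` with its `Range` parameter.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def split_ranges(size: int, part_size: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) byte ranges covering an object of `size` bytes."""
    if size < 0 or part_size <= 0:
        raise ValueError("size must be >= 0 and part_size must be > 0")
    return [(start, min(start + part_size, size) - 1)
            for start in range(0, size, part_size)]


def parallel_download(fetch_range: Callable[[int, int], bytes],
                      size: int, part_size: int, max_workers: int = 8) -> bytes:
    """Fetch every range concurrently and reassemble the parts in order."""
    ranges = split_ranges(size, part_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so the parts line up correctly.
        parts = pool.map(lambda r: fetch_range(*r), ranges)
        return b"".join(parts)


def s3_range_fetcher(bucket: str, key: str) -> Callable[[int, int], bytes]:
    """Build a fetcher backed by boto3 (imported lazily; needs AWS credentials)."""
    import boto3

    s3 = boto3.client("s3")

    def fetch(start: int, end: int) -> bytes:
        # HTTP Range headers use inclusive byte offsets.
        resp = s3.get_object(Bucket=bucket, Key=key, Range=f"bytes={start}-{end}")
        return resp["Body"].read()

    return fetch
```

With real credentials, `parallel_download(s3_range_fetcher(bucket, key), size, 8 * 1024 * 1024)` would fetch 8 MiB parts, where `size` comes from a prior `head_object` call. Multipart upload follows the same split-and-parallelize idea in the other direction.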

Architectural Patterns for High Performance

  • Use the AWS Common Runtime (CRT) client library to automatically handle connection management, parallelization, retries, and more
    • Provides a "target throughput" configuration to automatically scale connections
    • Available in the AWS SDKs for Java, C++, and Python, and in the AWS CLI
  • Leverage S3's prefix-based partitioning to scale throughput
    • Each prefix supports at least 3,500 PUT/COPY/POST/DELETE requests and 5,500 GET/HEAD requests per second
    • Strategically structure prefixes to avoid "sharp edges" when data patterns change over time
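To make the per-prefix numbers concrete, the sketch below estimates how many prefixes a target request rate would need and shows one way to spread keys across them. The hash-based sharding scheme and the names `prefixes_needed` and `sharded_key` are assumptions for illustration; the talk does not prescribe a specific key layout.

```python
import hashlib
import math

# Per-prefix request rates quoted above (requests per second).
PUTS_PER_PREFIX = 3_500
GETS_PER_PREFIX = 5_500


def prefixes_needed(target_rps: int, per_prefix_rps: int) -> int:
    """Smallest number of prefixes whose combined rate covers target_rps."""
    return max(1, math.ceil(target_rps / per_prefix_rps))


def sharded_key(key: str, shards: int) -> str:
    """Prepend a stable, hash-derived shard prefix so keys spread evenly."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    shard = int(digest[:8], 16) % shards
    return f"{shard:04d}/{key}"


if __name__ == "__main__":
    shards = prefixes_needed(55_000, GETS_PER_PREFIX)
    print(shards)  # 10
    print(sharded_key("logs/2025/12/01/app.log", shards))
```

A hash prefix trades listability for even distribution: keys that would naturally sort together (for example by date) end up under different prefixes, so this layout suits workloads that look objects up by full key rather than by listing.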

S3 Express One Zone

  • S3 Express One Zone is a high-performance storage class offering:
    • Single-digit millisecond access times
    • Up to 2 million requests per second per directory bucket
    • Ability to append data to existing objects
    • Constant-time object renaming
  • Key use cases:
    • Machine learning training
    • Interactive querying and analytics
    • Log and media streaming
    • Model loading for inference pipelines
  • Architectural considerations:
    • Single availability zone storage
    • Directory buckets with pre-scaled throughput
    • Session-based authentication for low-latency access
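The append capability listed above might be used as in the following sketch. It assumes a recent boto3 release in which `put_object` accepts a `WriteOffsetBytes` parameter for directory buckets; `append_bytes` is a hypothetical helper, and the client is passed in so the logic can be exercised without AWS access.

```python
def append_bytes(s3_client, bucket: str, key: str, data: bytes) -> int:
    """Append `data` to an existing object and return the new object size.

    Assumes `bucket` is an S3 Express One Zone directory bucket and that
    the object already exists.
    """
    # The write offset must equal the current object size for an append.
    current_size = s3_client.head_object(Bucket=bucket, Key=key)["ContentLength"]
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=data,
        WriteOffsetBytes=current_size,
    )
    return current_size + len(data)
```

The read-then-write sequence here is not atomic, so concurrent appenders to the same object would need their own coordination.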

Key Takeaways

  • Leverage S3's scale and parallelism to achieve high throughput and low latency
  • Use the AWS CRT client library to simplify performance optimization
  • Structure prefix hierarchies strategically to avoid throughput bottlenecks
  • Consider S3 Express One Zone for latency-sensitive, bursty workloads
  • Choose storage classes and architectural patterns based on specific performance requirements
