AWS re:Invent 2025 - Maximizing block storage performance for high-intensity workloads (STG319)

Maximizing Block Storage Performance for High-Intensity Workloads

Overview

  • Presentation by Mark Olsen, Senior Principal Engineer on the EBS team, and Jody, EBS Product Manager
  • Focuses on optimizing Amazon Elastic Block Store (EBS) performance for mission-critical applications
  • Discusses planning, testing, and monitoring strategies to ensure consistent, high-performance storage

Planning Your Infrastructure

  • Understand your application's workload requirements
    • For a transactional database, use high-performance IO2 volumes, which support up to 256K IOPS and are designed for 99.999% (five 9s) durability
    • Leverage different volume types (IO2, GP3) for different components (e.g. database journals, Kafka topics)
  • Take advantage of larger, faster GP3 volumes
    • GP3 limits have increased: max IOPS from 16K to 80K, max throughput from 1,000 to 2,000 MB/s, and max size from 16 TB to 64 TB
  • Understand performance characteristics of volume types
    • GP3: Designed for single-digit millisecond latencies 99% of the time
    • IO2: Designed for sub-millisecond latencies 99.9% of the time, with 10x fewer latency outliers than GP3
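
The planning guidance above can be turned into a rough first-pass check. The following is a minimal sketch, not an official sizing tool: the `Requirement` class and `suggest_volume_type` function are hypothetical names, the limits are the figures quoted in this summary, and IO2 throughput and size limits are deliberately not checked because the summary does not state them.

```python
from dataclasses import dataclass

# Maximums as quoted in this summary (assumptions; confirm against the
# current EBS documentation before relying on them).
GP3_MAX_IOPS = 80_000
GP3_MAX_THROUGHPUT_MBPS = 2_000
GP3_MAX_SIZE_TB = 64
IO2_MAX_IOPS = 256_000


@dataclass
class Requirement:
    """Hypothetical description of what one application component needs."""
    iops: int
    throughput_mbps: int
    size_tb: float
    needs_sub_ms_latency: bool = False


def suggest_volume_type(req: Requirement) -> str:
    """Return a first-pass suggestion: "GP3", "IO2", or "multiple volumes"."""
    fits_gp3 = (
        req.iops <= GP3_MAX_IOPS
        and req.throughput_mbps <= GP3_MAX_THROUGHPUT_MBPS
        and req.size_tb <= GP3_MAX_SIZE_TB
    )
    # GP3 is only suggested when the workload tolerates millisecond latency.
    if fits_gp3 and not req.needs_sub_ms_latency:
        return "GP3"
    # Only the IO2 IOPS ceiling is checked here; other IO2 limits are not.
    if req.iops <= IO2_MAX_IOPS:
        return "IO2"
    return "multiple volumes"
```

For example, `suggest_volume_type(Requirement(20_000, 500, 4))` suggests GP3, while the same requirement with `needs_sub_ms_latency=True` suggests IO2. Real sizing should also account for instance-level EBS limits.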

Testing and Benchmarking

  • Use flexible tools like FIO for raw I/O testing
  • Leverage industry-standard benchmarks like TPC-C to simulate realistic workloads
  • Perform end-to-end load testing on your application to uncover bottlenecks
  • Use AWS Fault Injection Service (FIS) to test resilience under controlled chaos
    • New EBS-specific actions to inject latency and simulate degraded performance
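
As an illustration of raw I/O testing with FIO, the sketch below builds a random-read `fio` command line and summarizes its JSON report. The helper names, the job name, and the default block size, queue depth, and runtime are assumptions chosen for illustration, not settings recommended in the talk.

```python
import json


def fio_randread_cmd(target: str, block_size: str = "16k",
                     iodepth: int = 32, runtime_s: int = 60) -> list[str]:
    """Build an fio command for a time-based random-read test on `target`."""
    return [
        "fio",
        "--name=randread-test",      # arbitrary job name
        f"--filename={target}",      # file or device under test
        "--rw=randread",             # random reads only (non-destructive)
        f"--bs={block_size}",
        f"--iodepth={iodepth}",
        "--ioengine=libaio",         # Linux asynchronous I/O engine
        "--direct=1",                # bypass the page cache
        "--time_based",
        f"--runtime={runtime_s}",
        "--output-format=json",
    ]


def summarize(fio_json: str) -> dict:
    """Extract read IOPS and p99 completion latency from fio's JSON report."""
    read = json.loads(fio_json)["jobs"][0]["read"]
    return {
        "iops": read["iops"],
        # fio reports completion latency percentiles in nanoseconds.
        "p99_latency_ms": read["clat_ns"]["percentile"]["99.000000"] / 1e6,
    }
```

The command list can be passed to `subprocess.run(cmd, capture_output=True, text=True, check=True)` and the captured standard output handed to `summarize`. Point write tests only at scratch files or disposable volumes, since writing to a raw device destroys its contents.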

Monitoring and Troubleshooting

  • Analyze instance status checks to identify infrastructure vs. application issues
    • Check for problems with the instance, EBS attachment, or instance liveness
  • Leverage new EBS performance metrics for deeper visibility
    • Average IOPS, throughput, and latency histograms per volume
  • Understand how EBS burst capacity works on instance types like M5, where smaller sizes can sustain peak EBS bandwidth only for a limited time, and size instances for the baseline your workload needs
  • Use elastic volumes to change volume type, size, and performance characteristics on the fly
    • Caveat: latency can be affected while a modified volume is still optimizing
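
To show how a per-volume latency histogram can be read, here is a minimal sketch that estimates a percentile from bucketed counts. The bin layout, a list of `(upper_bound_us, count)` pairs, is an assumption made for illustration; the actual format of the EBS metrics may differ.

```python
def percentile_upper_bound(bins, pct):
    """Return the upper bound (in microseconds) of the bin holding percentile `pct`.

    `bins` is a list of (upper_bound_us, count) pairs sorted by upper bound.
    The result is a conservative estimate: the true value lies at or below it.
    """
    total = sum(count for _, count in bins)
    if total == 0:
        raise ValueError("histogram is empty")
    threshold = total * pct / 100.0
    running = 0
    for upper_us, count in bins:
        running += count
        if running >= threshold:
            return upper_us
    return bins[-1][0]


# Made-up sample data: most I/Os complete within 250 microseconds.
sample_bins = [(250, 9000), (500, 950), (1000, 45), (4000, 5)]
p99_us = percentile_upper_bound(sample_bins, 99)
p999_us = percentile_upper_bound(sample_bins, 99.9)
```

Comparing a high percentile such as p99.9 against the average is one way to spot the latency outliers that averages hide.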

The Evolution of EBS Performance

  • EBS has continuously improved in performance and capabilities over the years
  • Nitro-based instances provide hardware offloads and a custom SRD network protocol for lower latency and higher throughput
  • New R8gb instances offer up to 720K IOPS and 150 Gbps of EBS bandwidth
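
The instance-level figures quoted above (720K IOPS and 150 Gbps) can be related with some quick arithmetic. This sketch assumes decimal units (1 Gbps = 1,000 Mbps, 1 MB = 1,000 KB); the function names are hypothetical.

```python
def gbps_to_mb_per_s(gbps: float) -> float:
    """Convert gigabits per second to megabytes per second (decimal units)."""
    return gbps * 1000 / 8


def break_even_io_size_kb(gbps: float, iops: int) -> float:
    """I/O size at which the IOPS limit and the bandwidth limit are hit together."""
    return gbps_to_mb_per_s(gbps) * 1000 / iops


bandwidth_mb_s = gbps_to_mb_per_s(150)                    # 18750.0 MB/s
break_even_kb = break_even_io_size_kb(150, 720_000)       # about 26 KB
```

Under these assumptions, workloads issuing I/Os smaller than roughly 26 KB would reach the IOPS ceiling first, while larger I/Os would reach the bandwidth ceiling first.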

Business Impact

  • Consistent, high-performance storage is essential for mission-critical applications such as medical device manufacturing and healthcare software
  • Proper planning, testing, and monitoring can help ensure resilience, compliance, and optimal performance
  • Leveraging the latest EBS features and instance types can provide a significant boost in throughput and IOPS for demanding workloads

Key Takeaways

  • Understand your application's storage requirements and choose the right EBS volume types
  • Thoroughly test your application's performance and resilience using tools like FIO, industry benchmarks, and AWS FIS
  • Monitor EBS performance metrics closely and leverage elastic volumes to adapt to changing needs
  • Take advantage of the latest EBS and EC2 instance advancements for maximum performance
