Scaling Prime Video for peak NFL streaming on AWS (ARC311)

Prime Video's Strategies for Scaling NFL Thursday Night Football on AWS

Challenges and Opportunities

  • The primary challenge was delivering a seamless streaming experience to millions of viewers despite the highly variable, spiky traffic patterns of NFL Thursday Night Football games.
  • Peak viewership surged from 10 million in 2022 to 18 million in 2024, putting tremendous pressure on Prime Video's infrastructure.
  • The business opportunity lay in scaling infrastructure cost-effectively to meet this demand while maintaining high availability.

Multi-Region Architecture

  • Prime Video adopted a multi-region architecture to enhance reliability and resilience:
    • Instance type flexibility: Using a diverse set of instance types to increase the available capacity pool.
    • AZ flexibility: Leveraging multiple Availability Zones within a region for fault tolerance.
    • Multi-region flexibility: Extending the architecture across multiple AWS Regions to handle regional outages and optimize for latency.
  • This multi-region approach was implemented in three key areas of the Prime Video stack:
    1. Signal Delivery: The live signal ingestion stack was built from the ground up to support multi-region.
    2. Playback: The playback stack was globalized to enable regional failover and consistent user experience.
    3. Application Storefront: The storefront stack was also migrated to a multi-region architecture, including data replication and globalization.
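
The regional failover idea above can be illustrated with a minimal sketch. It assumes each Region reports a health flag and a measured latency; the `RegionStatus` type and `pick_region` function are hypothetical names for illustration, not part of Prime Video's stack or any AWS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionStatus:
    name: str          # AWS Region code, e.g. "us-east-1"
    healthy: bool      # result of a (hypothetical) health check
    latency_ms: float  # measured latency from the viewer's location


def pick_region(regions: list[RegionStatus]) -> str:
    """Return the lowest-latency healthy Region, or raise if none is healthy."""
    healthy = [r for r in regions if r.healthy]
    if not healthy:
        raise RuntimeError("no healthy region available")
    return min(healthy, key=lambda r: r.latency_ms).name


# Example: the closest Region is unhealthy, so traffic goes to the next best one.
statuses = [
    RegionStatus("us-east-1", healthy=False, latency_ms=20.0),
    RegionStatus("us-west-2", healthy=True, latency_ms=70.0),
    RegionStatus("eu-west-1", healthy=True, latency_ms=140.0),
]
chosen = pick_region(statuses)  # "us-west-2"
```

In practice this decision is usually made by DNS or a global routing layer rather than by application code; the sketch only shows the selection rule.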

Elastic Scaling

  • The unique challenges for scaling Prime Video's infrastructure for live sports events include:
    • A highly variable "peak-to-mean" ratio, with NFL Thursday Night Football games causing orders-of-magnitude spikes in traffic.
    • Spiky traffic patterns within the games, with sharp increases at kickoff and halftime.
    • The need for coordination across hundreds of distributed service teams to scale up and down.
  • To address these challenges, Prime Video built a centralized auto-scaling solution that:
    1. Leverages a forecasting system to predict demand and optimize capacity planning.
    2. Provides a central hub to route scaling signals and manage the auto-scaling process.
    3. Embeds a transformation library in each team's service to translate scaling signals into automated, service-specific scaling actions.
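
The three pieces above (forecast, central hub, per-service transformation) might fit together roughly as follows. This is a sketch under stated assumptions: the `ScalingHub` class, the `per_instance_transform` helper, the service names, and the capacity figures are all hypothetical and are not taken from the talk.

```python
from typing import Callable

# A transform maps forecast concurrent viewers to a desired instance count.
Transform = Callable[[int], int]


def per_instance_transform(viewers_per_instance: int, headroom_pct: int = 20) -> Transform:
    """Build a service-specific transform with a percentage safety margin."""
    def transform(forecast_viewers: int) -> int:
        needed = forecast_viewers * (100 + headroom_pct)
        capacity = 100 * viewers_per_instance
        return -(-needed // capacity)  # integer ceiling division
    return transform


class ScalingHub:
    """Central hub that fans one forecast out to every registered service."""

    def __init__(self) -> None:
        self._transforms: dict[str, Transform] = {}

    def register(self, service: str, transform: Transform) -> None:
        self._transforms[service] = transform

    def publish(self, forecast_viewers: int) -> dict[str, int]:
        # Each service converts the shared signal into its own target size.
        return {name: t(forecast_viewers) for name, t in self._transforms.items()}


hub = ScalingHub()
hub.register("playback", per_instance_transform(viewers_per_instance=10_000))
hub.register("storefront", per_instance_transform(viewers_per_instance=50_000))
targets = hub.publish(forecast_viewers=18_000_000)
# targets == {"playback": 2160, "storefront": 432}
```

The design point is that service teams own only their transform, while the forecast and the scaling signal come from one place, which removes the need to coordinate hundreds of teams by hand.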

Well-Architected Framework

  • The key design principles Prime Video applied to ensure reliability and resilience included:
    • Automatic recovery from failures
    • Testing beyond destruction to validate recovery procedures
    • Horizontal scaling to increase aggregate workload availability
    • Relying on automation instead of guessing capacity
  • Other best practices included:
    • Understanding data consistency, availability, and partition tolerance trade-offs
    • Identifying and managing dependencies, both internal and external
    • Ensuring operational readiness through automated recovery procedures and cross-region health checks
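
Automated recovery across Regions can be sketched as a call that falls back to the next Region when one fails. The `call_with_failover` function and the `fake_request` stand-in below are hypothetical, and the sketch assumes failures surface as `ConnectionError`.

```python
from typing import Callable, Sequence


def call_with_failover(regions: Sequence[str], request: Callable[[str], str]) -> str:
    """Try each Region in order and return the first successful response."""
    errors = []
    for region in regions:
        try:
            return request(region)
        except ConnectionError as exc:
            # Record the failure and move on to the next Region.
            errors.append(f"{region}: {exc}")
    raise RuntimeError("all regions failed: " + "; ".join(errors))


def fake_request(region: str) -> str:
    """Stand-in for a real service call; simulates an outage in one Region."""
    if region == "us-east-1":
        raise ConnectionError("simulated outage")
    return f"ok from {region}"


result = call_with_failover(["us-east-1", "us-west-2"], fake_request)
# result == "ok from us-west-2"
```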

Key Outcomes

  • Achieved the operational resilience needed to support growth to millions of viewers during NFL Thursday Night Football games.
  • Enabled dynamic scaling to handle the highly variable "peak-to-mean" ratio, with a 20% reduction in operational costs.
  • Realized a 43% reduction in carbon footprint by optimizing the number of EC2 instances and adopting Graviton.

The overarching principle guiding Prime Video's approach was to accept that "everything fails all the time" and to build a system that treats failure as a natural occurrence, using multi-region architectures and automated scaling.
