Scaling technology for millions of cricket fans on AWS (MAE309)

Key Takeaways

Introduction

  • The session describes how Dream11 uses AWS to enhance the fan experience of cricket, an extremely popular game in India.
  • The speakers are Satender Singh, Solutions Architect at AWS in India, Shent Gupta, VP of Engineering at Dream11, and Raj Chri, also VP of Engineering at Dream11.

Factors Contributing to Dream11's Scale

  • India has a young population, with 65% below 35 years of age.
  • High internet penetration, with 900 million users out of a population of 1.3 billion.
  • Affordable data, at 12 cents per GB.
  • Cricket's immense popularity in India, with 500 million viewers for the last Cricket World Cup.
  • The rise of fantasy sports, with 490 million online game players, a substantial portion of whom are on Dream11.

Dream11's Scale and Metrics

  • Dream11 is the world's largest fantasy sports platform, with over 220 million users.
  • The platform added more than 55 million users in the last year alone, equivalent to about 15% of the U.S. population.
  • Dream11's peak concurrency is 15 million users, and it processes 376 million concurrent requests during the Indian Premier League (IPL).

Challenges Faced by Dream11

  1. Scale:

    • 46,000 EC2 instances, 90% of which are spot instances.
    • 163 Application Load Balancers serving more than 145 million requests per minute.
    • 166 Aurora MySQL instances, with one cluster handling 1 million requests per minute.
    • Over 50 ElastiCache clusters, with a single cluster supporting more than 4 million requests per minute.
  2. Cost:

    • Balancing the cost and availability of spot instances.
    • Experimenting with different allocation strategies, such as capacity-optimized and price-capacity-optimized.
    • Using 140 spot instance types and exploring attribute-based instance type selection.
  3. Security:

    • Implementing a defense-in-depth approach, including Amazon Inspector, GuardDuty, and Web Application Firewall.
    • Detecting and temporarily blocking malicious traffic in real-time using automation.
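
As an illustration of the temporary-blocking idea above, here is a minimal Python sketch of a sliding-window counter that blocks a client for a fixed period once it exceeds a request threshold. The session does not detail how Dream11 implements this, so the class name `TemporaryBlocker`, its parameters, and the thresholds are all hypothetical; a rule like this could equally be expressed in a managed layer such as a web application firewall.

```python
from collections import defaultdict, deque


class TemporaryBlocker:
    """Hypothetical sliding-window limiter that temporarily blocks noisy clients."""

    def __init__(self, max_requests: int, window_s: float, block_s: float):
        self.max_requests = max_requests  # allowed requests per window
        self.window_s = window_s          # sliding window length in seconds
        self.block_s = block_s            # how long a block lasts in seconds
        self._hits = defaultdict(deque)   # client -> timestamps of recent requests
        self._blocked_until = {}          # client -> time at which the block expires

    def allow(self, client: str, now: float) -> bool:
        """Return True if the request is allowed at time `now`."""
        until = self._blocked_until.get(client)
        if until is not None:
            if now < until:
                return False              # still inside an active block
            del self._blocked_until[client]  # block has expired

        hits = self._hits[client]
        # Drop timestamps that have fallen out of the sliding window.
        while hits and hits[0] <= now - self.window_s:
            hits.popleft()
        hits.append(now)

        if len(hits) > self.max_requests:
            # Threshold exceeded: start a temporary block and reset the counter.
            self._blocked_until[client] = now + self.block_s
            hits.clear()
            return False
        return True
```

With `TemporaryBlocker(max_requests=3, window_s=10, block_s=60)`, a client's fourth request inside ten seconds is rejected and the client stays blocked for the next sixty seconds, after which requests are accepted again. Passing `now` explicitly keeps the sketch deterministic; a real deployment would read a clock and share state across nodes.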

Architectural Challenges and Solutions

  1. Impulse Traffic:

    • Traffic surges from 60 million to 270 million requests per minute within 3 minutes at the start of a match.
    • Solution: A system called "Scaler" that predicts the upcoming traffic and scales the systems accordingly.
  2. Real-time Leaderboard:

    • Updating the leaderboard for 20 million teams every few seconds.
    • Solution: Leveraging Apache Spark Streaming to process the data in real-time and efficiently handle the scale.
  3. Resilience and Failure Handling:

    • Mechanisms like back-pressure, circuit breakers, and sharding to handle failures and minimize impact.
    • Example: Active-passive setup for the critical Team Service to ensure quick recovery.
  4. Personalization at Scale:

    • Challenges: Evolving user preferences, relevancy over recency, and batch predictions.
    • Solution: An in-house system called "Darwin" that uses Ray Serve for real-time personalization of match recommendations for 15 million+ users.
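
The impulse-traffic figures above suggest why capacity has to be added ahead of the surge rather than in reaction to it. The following Python sketch works through the capacity arithmetic under stated assumptions: the per-instance throughput of 100,000 requests per minute, the 20% headroom, and the function names are hypothetical and are not taken from the session or from the Scaler system.

```python
def instances_needed(predicted_rpm: int, rpm_per_instance: int, headroom_pct: int = 20) -> int:
    """Instances required to serve predicted_rpm plus a safety margin.

    Uses integer ceiling division to avoid floating-point rounding.
    """
    if rpm_per_instance <= 0:
        raise ValueError("rpm_per_instance must be positive")
    required = predicted_rpm * (100 + headroom_pct)
    capacity = rpm_per_instance * 100
    return -(-required // capacity)


def scale_out_start(match_start_s: int, warmup_s: int) -> int:
    """Latest time (in seconds) to begin scaling so capacity is warm at match start."""
    return match_start_s - warmup_s


# Traffic figures from the session: 60M -> 270M requests per minute at match start.
ASSUMED_RPM_PER_INSTANCE = 100_000  # hypothetical per-instance throughput

baseline = instances_needed(60_000_000, ASSUMED_RPM_PER_INSTANCE)
peak = instances_needed(270_000_000, ASSUMED_RPM_PER_INSTANCE)
extra = peak - baseline  # instances that must be ready before the surge
```

Under these assumed figures the fleet would grow from 720 to 3,240 instances, so 2,520 additional instances have to be ready before the match begins. With the surge arriving within 3 minutes, instances that take minutes to boot and warm up cannot be added reactively, which is the motivation for predicting traffic and scaling in advance.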

Future Roadmap

  1. Investing in building platform solutions for business growth experiments and supporting more sports and formats.
  2. Continuously improving to handle higher concurrency and reducing operational overhead of managing multi-sharded data stores.
  3. Exploring the use of AI/ML for test case generation and synthetic data creation to improve quality.
