AWS re:Invent 2025 - Designing mission critical applications with serverless services (CNS362)

Modernizing Mission-Critical Applications with AWS Serverless

Distributed Systems as Living, Evolving Entities

  • Distributed systems are not static, but rather living, evolving entities that require ongoing attention and care
  • Key goals of embracing distributed systems:
    • Organizational scalability: Ability to plug in new teams or partners to accelerate delivery
    • Business agility: Ability to quickly adapt to internal or external changes
    • Faster feedback loops: Track delivery with metrics such as the DORA metrics (deployment frequency, lead time for changes, change failure rate, time to restore service)
    • Reduced external dependencies: Decrease cognitive load and blast radius

Achieving Modularity Through Serverless

  • Modularity can be expressed through code (e.g., encapsulation, design patterns) but requires discipline
  • Modularity can also be achieved at the infrastructure level using serverless:
    • Ability to leverage synchronous and asynchronous programming as needed
    • Configuration-driven rather than code-driven, leveraging built-in serverless behaviors
    • Serverless enables focusing on business logic and value generation, not undifferentiated heavy lifting
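
As an illustration of "configuration-driven rather than code-driven", the sketch below builds an Amazon States Language Task state whose retry and error handling are declared as data instead of being written as loops inside the function. The ARN, the state names, and the retry values are assumptions chosen for the example and do not come from the talk.

```python
import json

# Hypothetical function ARN, used only for illustration.
VALIDATE_ARN = "arn:aws:lambda:eu-west-1:123456789012:function:validate-request"


def task_with_retry(resource_arn: str, next_state: str, failure_state: str) -> dict:
    """Return an Amazon States Language Task state with declarative Retry and Catch."""
    return {
        "Type": "Task",
        "Resource": resource_arn,
        # Retry behavior lives in configuration, not in the function's code.
        "Retry": [
            {
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 2,   # wait before the first retry
                "MaxAttempts": 3,       # give up after three retries
                "BackoffRate": 2.0,     # double the wait on each retry
            }
        ],
        # Any error that survives the retries is routed to a fallback state.
        "Catch": [{"ErrorEquals": ["States.ALL"], "Next": failure_state}],
        "Next": next_state,
    }


state = task_with_retry(VALIDATE_ARN, "ProcessRequest", "HandleFailure")
print(json.dumps(state, indent=2))
```

Changing the retry policy here means editing the definition only; the Lambda function itself stays untouched.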

Booking.com's Serverless Modernization Journey

Challenges with the Legacy Monolithic System

  • Slow deployments (monthly instead of weekly)
  • Lengthy testing cycles (tens of minutes)
  • Frequent rollbacks due to issues
  • Tight coupling and cross-dependencies leading to a "big ball of mud"
  • Unclear ownership and domain boundaries
  • Performance and scalability issues with on-premises infrastructure

Designing a Serverless Target Architecture

  • Prioritized goals: isolating unknown code, defining boundaries, removing coupling
  • Chose AWS serverless services (Lambda, Step Functions) to achieve modularity and observability
  • Addressed team skepticism about serverless through a proof-of-concept and gradual adoption

Key Architectural Decisions and Patterns

  • Leveraged Step Functions to orchestrate reusable Lambda functions into clear, isolated workflows
  • Used DynamoDB to temporarily store requests, with TTL to manage database size
  • Employed SNS and SQS for asynchronous, decoupled communication between components

Overcoming Challenges

  • Addressed Lambda cold starts and latency through provisioned concurrency and code optimizations
  • Developed a solution for creating personal AWS environments for each team
  • Worked with compliance team to address concerns about serverless in a mission-critical system

Testing and Deployment Approach

  • Utilized shadow traffic to identify functionality gaps and performance issues
  • Adopted an experimentation-based go-live strategy, gradually increasing traffic to the new system
  • Leveraged data science to determine optimal traffic division across payment timings
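
One common way to increase traffic gradually is a deterministic, hash-based split, sketched below. The talk does not describe the actual mechanism, so the function names and the choice of hashing the reservation ID are assumptions. The useful property is that a given ID always lands in the same bucket, so raising the percentage only moves traffic from the legacy system to the new one and never back.

```python
import hashlib

BUCKETS = 10_000  # resolution of 0.01 percent


def bucket(key: str) -> int:
    """Map a key to a stable bucket in [0, BUCKETS)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS


def route(reservation_id: str, new_system_percent: float) -> str:
    """Return "new" or "legacy" for a reservation at the given rollout percentage."""
    if not 0 <= new_system_percent <= 100:
        raise ValueError("new_system_percent must be between 0 and 100")
    threshold = round(new_system_percent * BUCKETS / 100)
    return "new" if bucket(reservation_id) < threshold else "legacy"


ids = [f"reservation-{n}" for n in range(10_000)]
share = sum(route(i, 20) == "new" for i in ids) / len(ids)
print(f"share routed to the new system at 20%: {share:.3f}")
```

Shadow traffic can reuse the same routing decision: the legacy system still serves the response while a copy of the request is sent to the new system for comparison.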

Key Benefits and Outcomes

  • Reduced onboarding time for new developers (from 4 months to 1 month)
  • Improved developer experience with faster debugging and incident response
  • Significantly reduced time-to-market for new features and capabilities
  • Enabled 100% of new reservations to be processed through the modernized serverless system

Serverless Migration Patterns and Strategies

  • Seven R's of migration: Retain, Retire, Relocate, Rehost, Replatform, Repurchase, Refactor
  • Importance of addressing the triangle of people, processes, and technology
  • Examples of modernization patterns:
    • Strangler Fig Pattern: Incremental modernization, routing traffic between legacy and new systems
    • Branch by Abstraction: Placing the legacy and new implementations behind a shared abstraction so callers can be switched over incrementally
    • Decompose by Business Capability: Aligning architectural boundaries with business functions
    • Decompose by Subdomain: Tailoring logic based on specific business requirements (e.g., regional)
    • Decompose by Transactions: Decoupling user experiences based on immediate vs. long-running actions
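
Branch by Abstraction, for instance, might look like the sketch below. All class and function names are hypothetical. Callers depend only on the abstraction, so the legacy and serverless implementations can coexist while a flag (or a Strangler Fig style router) decides which one handles a request.

```python
from abc import ABC, abstractmethod


class ReservationProcessor(ABC):
    """Abstraction that both implementations satisfy (hypothetical name)."""

    @abstractmethod
    def process(self, reservation_id: str) -> str:
        ...


class LegacyProcessor(ReservationProcessor):
    def process(self, reservation_id: str) -> str:
        # Stand-in for a call into the legacy monolith.
        return f"legacy:{reservation_id}"


class ServerlessProcessor(ReservationProcessor):
    def process(self, reservation_id: str) -> str:
        # Stand-in for starting the new serverless workflow.
        return f"serverless:{reservation_id}"


def make_processor(use_new_system: bool) -> ReservationProcessor:
    """Pick an implementation; callers never see which one they got."""
    return ServerlessProcessor() if use_new_system else LegacyProcessor()


for flag in (False, True):
    print(make_processor(flag).process("r-42"))
```

Once all traffic uses the new implementation, the legacy class and the flag can be deleted without touching any caller.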

Rethinking Communication Patterns with Serverless

  • Challenges with synchronous communication in high-load scenarios (e.g., gift code validation)
  • Leveraging asynchronous patterns with Amazon DynamoDB, Amazon EventBridge, and AWS Lambda
    • Immediate synchronous feedback loop for customers
    • Asynchronous event propagation and processing for external validations and enrichment
  • Embracing eventual consistency over immediate consistency when the business context allows
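
The flow above can be simulated in memory as a rough sketch. A dict stands in for the DynamoDB table and a list stands in for the EventBridge bus; the event source, detail type, and status values are assumptions. The entry keys (`Source`, `DetailType`, `Detail`) follow the shape of an EventBridge PutEvents entry.

```python
import json
import uuid


def accept_gift_code(code: str, table: dict, outbox: list) -> dict:
    """Synchronous path: record the request, emit an event, answer immediately."""
    request_id = str(uuid.uuid4())
    table[request_id] = {"code": code, "status": "PENDING"}
    outbox.append({
        "Source": "example.giftcodes",      # hypothetical event source
        "DetailType": "GiftCodeSubmitted",  # hypothetical detail type
        "Detail": json.dumps({"request_id": request_id, "code": code}),
    })
    # The customer gets feedback before any external validation has run.
    return {"request_id": request_id, "status": "PENDING"}


def handle_event(entry: dict, table: dict, is_valid) -> None:
    """Asynchronous path: validate later and update the stored status."""
    detail = json.loads(entry["Detail"])
    status = "VALID" if is_valid(detail["code"]) else "REJECTED"
    table[detail["request_id"]]["status"] = status


table, outbox = {}, []
response = accept_gift_code("GIFT-123", table, outbox)
print(response["status"])                        # status right after submission
handle_event(outbox[0], table, lambda c: c.startswith("GIFT-"))
print(table[response["request_id"]]["status"])   # status after async validation
```

Between the two calls the stored status is `PENDING`, which is the window of eventual consistency the business context has to tolerate.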

Key Takeaways

  • Distributed systems are living, evolving entities that require ongoing attention and care
  • Serverless enables achieving modularity at the infrastructure level, allowing teams to focus on business value
  • Booking.com's serverless modernization journey demonstrates the benefits of increased agility, developer productivity, and time-to-market
  • Serverless migration patterns and strategies (e.g., Strangler Fig, Decompose by Capability) can help organizations modernize incrementally
  • Rethinking communication patterns with asynchronous serverless services can unlock improved user experiences and system resilience
