AWS re:Invent 2025 - Designing mission critical applications with serverless services (CNS362)
Modernizing Mission-Critical Applications with AWS Serverless
Distributed Systems as Living, Evolving Entities
Distributed systems are not static, but rather living, evolving entities that require ongoing attention and care
Key goals of embracing distributed systems:
Organizational scalability: Ability to plug in new teams or partners to accelerate delivery
Business agility: Ability to quickly adapt to internal or external changes
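Faster feedback loops: Use delivery metrics such as the DORA metrics (deployment frequency, lead time for changes, change failure rate, time to restore service) to see quickly whether changes are working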
Reduced external dependencies: Decrease cognitive load and blast radius
Achieving Modularity Through Serverless
Modularity can be expressed through code (e.g., encapsulation, design patterns) but requires discipline
Modularity can also be achieved at the infrastructure level using serverless:
Ability to leverage synchronous and asynchronous programming as needed
Configuration-driven rather than code-driven, leveraging built-in serverless behaviors
Serverless enables focusing on business logic and value generation, not undifferentiated heavy lifting
Booking.com's Serverless Modernization Journey
Challenges with the Legacy Monolithic System
Slow deployments (monthly instead of weekly)
Lengthy testing cycles (tens of minutes)
Frequent rollbacks caused by defects in releases
Tight coupling and cross-dependencies leading to a "big ball of mud"
Unclear ownership and domain boundaries
Performance and scalability issues with on-premises infrastructure
Designing a Serverless Target Architecture
Prioritized goals: isolating unknown code, defining boundaries, removing coupling
Chose AWS serverless services (Lambda, Step Functions) to achieve modularity and observability
Addressed team skepticism about serverless through a proof-of-concept and gradual adoption
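To make the idea of expressing modularity through Step Functions and Lambda more concrete, here is a minimal sketch of a workflow definition in Amazon States Language, built as a Python dictionary. The state names, function names, account ID, and region are hypothetical placeholders, not details from the talk. The retry policy is included only to illustrate behavior declared in configuration rather than in function code.

```python
import json

# Hypothetical placeholder: account ID, region, and function names are invented.
ARN_PREFIX = "arn:aws:lambda:eu-west-1:123456789012:function:"


def task(function_name, next_state=None):
    """Build a Task state that invokes one Lambda function."""
    state = {
        "Type": "Task",
        "Resource": ARN_PREFIX + function_name,
        # Retries live in the workflow definition, not in the function code.
        "Retry": [
            {
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2.0,
            }
        ],
    }
    if next_state is None:
        state["End"] = True  # terminal state of the workflow
    else:
        state["Next"] = next_state
    return state


def build_state_machine():
    """Chain three small, reusable functions into one isolated workflow."""
    return {
        "Comment": "Illustrative workflow; step names are assumptions",
        "StartAt": "ValidateRequest",
        "States": {
            "ValidateRequest": task("validate-request", "ProcessPayment"),
            "ProcessPayment": task("process-payment", "NotifyCustomer"),
            "NotifyCustomer": task("notify-customer"),
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_state_machine(), indent=2))
```

Each step stays a small function with a single responsibility, while the ordering and error handling are visible in one definition that can be reviewed and changed independently of the function code.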
Key Architectural Decisions and Patterns
Leveraged Step Functions to orchestrate reusable Lambda functions into clear, isolated workflows
Used DynamoDB to temporarily store requests, with TTL to manage database size
Employed SNS and SQS for asynchronous, decoupled communication between components
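The temporary request storage with TTL can be sketched as follows. DynamoDB TTL expects a Number attribute holding an expiry time in Unix epoch seconds; the attribute names, the 24-hour retention, and the helper functions below are assumptions for illustration, not details from the talk. The sketch only builds the item and does not call AWS.

```python
import time

# Assumed retention period; the talk does not state the actual value.
TTL_SECONDS = 24 * 60 * 60


def build_request_item(request_id, payload, now=None, ttl_seconds=TTL_SECONDS):
    """Return an item whose 'expires_at' attribute can serve as the table's TTL attribute."""
    now = int(time.time()) if now is None else int(now)
    return {
        "request_id": request_id,         # hypothetical partition key
        "payload": payload,
        "created_at": now,
        "expires_at": now + ttl_seconds,  # epoch seconds, as DynamoDB TTL requires
    }


def is_expired(item, now=None):
    """Filter on read: TTL deletion is a background process and is not immediate."""
    now = int(time.time()) if now is None else int(now)
    return item["expires_at"] <= now
```

Because DynamoDB removes expired items asynchronously, readers that must never see stale requests would apply a check like `is_expired` (or an equivalent filter expression) when reading.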
Overcoming Challenges
Addressed Lambda cold starts and latency through provisioned concurrency and code optimizations
Developed a solution for creating personal AWS environments for each team
Worked with compliance team to address concerns about serverless in a mission-critical system
Testing and Deployment Approach
Utilized shadow traffic to identify functionality gaps and performance issues
Adopted an experimentation-based go-live strategy, gradually increasing traffic to the new system
Leveraged data science to determine optimal traffic division across payment timings
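One common way to increase traffic to a new system gradually is deterministic hash bucketing, sketched below. This is an assumed technique for illustration; the talk summary does not describe how Booking.com implemented its traffic split, and the function and parameter names are hypothetical.

```python
import hashlib

BUCKETS = 10_000  # resolution of 0.01 percent


def bucket(reservation_id: str) -> int:
    """Map an ID to a stable bucket in [0, BUCKETS) using a hash of the ID."""
    digest = hashlib.sha256(reservation_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS


def route(reservation_id: str, new_system_percent: float) -> str:
    """Send roughly new_system_percent of IDs to the new system, the rest to legacy."""
    threshold = round(new_system_percent * BUCKETS / 100)
    return "new" if bucket(reservation_id) < threshold else "legacy"
```

Because the bucket depends only on the ID, a reservation that was routed to the new system at 10% stays there when the percentage is raised to 50%, which keeps the experience consistent while the rollout widens.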
Key Benefits and Outcomes
Reduced onboarding time for new developers (from 4 months to 1 month)
Improved developer experience with faster debugging and incident response
Significantly reduced time-to-market for new features and capabilities
100% of new reservations are now processed through the modernized serverless system
Serverless Migration Patterns and Strategies
Seven R's of migration: Retain, Retire, Relocate, Rehost, Replatform, Repurchase, Refactor
Importance of addressing the triangle of people, processes, and technology
Examples of modernization patterns:
Strangler Fig Pattern: Incremental modernization, routing traffic between legacy and new systems
Branch by Abstraction: Abstracting complexity of dual implementations
Decompose by Business Capability: Aligning architectural boundaries with business functions
Decompose by Subdomain: Tailoring logic based on specific business requirements (e.g., regional)
Decompose by Transactions: Decoupling user experiences based on immediate vs. long-running actions
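The Strangler Fig pattern listed above can be illustrated with a small routing facade: requests for capabilities that have already been migrated go to the new implementation, and everything else still reaches the legacy system. The class, method, and capability names are hypothetical, and in practice the routing would usually live in an API gateway or proxy rather than in application code.

```python
from typing import Callable, Dict

Handler = Callable[[dict], dict]


class StranglerRouter:
    """Facade that moves capabilities from legacy to new, one at a time."""

    def __init__(self, legacy: Handler):
        self._legacy = legacy
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, capability: str, handler: Handler) -> None:
        """Start routing this capability to the new implementation."""
        self._migrated[capability] = handler

    def rollback(self, capability: str) -> None:
        """Route the capability back to legacy if the new path misbehaves."""
        self._migrated.pop(capability, None)

    def handle(self, capability: str, request: dict) -> dict:
        handler = self._migrated.get(capability, self._legacy)
        return handler(request)


def legacy_system(request: dict) -> dict:
    return {"handled_by": "legacy", **request}


def new_payments(request: dict) -> dict:
    return {"handled_by": "serverless", **request}


router = StranglerRouter(legacy_system)
router.migrate("payments", new_payments)
```

Callers only ever talk to the router, so each capability can be migrated, observed, and rolled back without the callers changing.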
Rethinking Communication Patterns with Serverless
Challenges with synchronous communication in high-load scenarios (e.g., gift code validation)
Leveraging asynchronous patterns with Amazon DynamoDB, Amazon EventBridge, and AWS Lambda
Immediate synchronous feedback loop for customers
Asynchronous event propagation and processing for external validations and enrichment
Embracing eventual consistency over immediate consistency when the business context allows
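The split between an immediate synchronous response and asynchronous processing can be sketched with in-memory stand-ins. Here a dictionary stands in for a DynamoDB table and a deque for an event bus such as EventBridge; the event shape, status values, and the `external_validation` rule are all assumptions made for illustration.

```python
import uuid
from collections import deque

table = {}           # stand-in for a DynamoDB table
event_bus = deque()  # stand-in for an event bus such as Amazon EventBridge


def submit_gift_code(code: str) -> dict:
    """Synchronous path: cheap local checks, record as PENDING, answer immediately."""
    if not code or not code.isalnum():
        return {"status": "REJECTED", "reason": "malformed code"}
    request_id = str(uuid.uuid4())
    table[request_id] = {"code": code, "status": "PENDING"}
    event_bus.append({"type": "GiftCodeSubmitted", "request_id": request_id})
    return {"status": "PENDING", "request_id": request_id}


def external_validation(code: str) -> bool:
    """Hypothetical slow third-party check, reduced here to a prefix rule."""
    return code.startswith("GIFT")


def process_events() -> int:
    """Asynchronous path: a consumer (e.g., a Lambda function) drains the events."""
    processed = 0
    while event_bus:
        event = event_bus.popleft()
        record = table[event["request_id"]]
        record["status"] = "VALID" if external_validation(record["code"]) else "INVALID"
        processed += 1
    return processed
```

The customer receives a response as soon as the request is recorded, and the status becomes VALID or INVALID only after the consumer has run. That window is the eventual consistency the business context has to tolerate.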
Key Takeaways
Distributed systems are living, evolving entities that require ongoing attention and care
Serverless enables achieving modularity at the infrastructure level, allowing teams to focus on business value
Booking.com's serverless modernization journey demonstrates the benefits of increased agility, developer productivity, and time-to-market
Serverless migration patterns and strategies (e.g., Strangler Fig, Decompose by Capability) can help organizations modernize incrementally
Rethinking communication patterns with asynchronous serverless services can unlock improved user experiences and system resilience