AWS re:Invent 2025 - FIS: High-performance instant payment processing at massive scale (IND3318)

Transforming Payments at Scale: FIS and AWS Collaboration

Overview

  • FIS, a leading global financial technology provider, collaborated with AWS to rethink payment solutions from the ground up and build a high-performance, massively scalable payment processing platform called the "Money Movement Hub".
  • The session covers the key requirements, technical challenges, architectural decisions, and the innovative solutions implemented to address scalability, performance, resiliency, and observability.

The Changing Payments Landscape

  • Payments have become an integral part of daily life, with increasing customer expectations for faster, more seamless, and always-on experiences.
  • Traditional payment rails and batch-oriented banking systems are struggling to keep up with the demand for instant, programmable, and intelligent payment processing.
  • FIS recognized the need to rethink payments from the ground up, moving away from siloed payment systems towards a unified, orchestrated, and intelligent payment processing platform.

Key Technical Requirements

  1. Performance: Ability to execute and settle payments in under 5 seconds, meeting the requirements of instant and real-time payment schemes.
  2. Scalability: Designed as a multi-tenant solution, capable of processing over 1,000 payments per second to handle fluctuating payment volumes.
  3. Availability: Targeted 99.995% availability, which allows about 26 minutes of downtime per year.
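
The three targets above translate into simple capacity numbers. The sketch below is only back-of-the-envelope arithmetic, not anything shown in the session: it converts the availability target into a yearly downtime budget and uses Little's Law to estimate how many payments are in flight at once if every payment takes the full 5 seconds. The function names are illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_budget_minutes(availability_pct: float) -> float:
    """Yearly downtime allowed by an availability target given in percent."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


def in_flight_payments(arrival_rate_per_s: float, latency_s: float) -> float:
    """Little's Law: average items in the system = arrival rate x time in system."""
    return arrival_rate_per_s * latency_s


budget = downtime_budget_minutes(99.995)
concurrent = in_flight_payments(1000, 5)
print(f"Downtime budget: {budget:.1f} minutes per year")
print(f"Payments in flight at 1,000/s and 5 s each: {concurrent:.0f}")
```

At 99.995% the budget comes to roughly 26.3 minutes per year, and the worst-case concurrency at the stated throughput and latency limits is 5,000 payments.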

Technical Challenges

  • Seamless integration with existing on-premises banking systems and legacy technologies.
  • Ability to easily extend the system to support new payment rails, such as digital currencies and stablecoins, without a complete overhaul.
  • Ensuring end-to-end observability across the complex payment processing pipeline.

Architectural Approach

  • Adopted a multi-account strategy on AWS to isolate different subsystems and enable independent teams to work with different operating models.
  • Implemented an event-driven, microservices-based architecture orchestrated by Kubernetes (EKS) for high scalability and resiliency.
  • Leveraged AWS services for high-performance compute (EKS), database (Amazon Aurora), caching (Amazon ElastiCache), and data processing (Amazon MSK, AWS Glue).
  • Utilized open-source technologies like Cadence, Conductor, and Kubernetes autoscalers to enhance scalability, performance, and workflow management.
  • Designed for end-to-end observability using Amazon CloudWatch, including CloudWatch Application Signals for monitoring service-level objectives (SLOs).
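
To make the role of the Kubernetes autoscalers concrete, the sketch below applies the replica formula documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). The metric (payments per second per pod), the targets, and the replica bounds are assumptions chosen for illustration, not values from the session, and the autoscaler's tolerance band and stabilization windows are left out.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1,
                     max_replicas: int = 100) -> int:
    """HPA-style scaling rule, clamped to the configured replica bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical load: 10 pods, each sized for 100 payments/s.
print(desired_replicas(10, current_metric=150, target_metric=100))  # scale out
print(desired_replicas(10, current_metric=40, target_metric=100))   # scale in
```

With these made-up numbers the rule asks for 15 pods under the heavier load and 4 pods under the lighter one. Node capacity to host the extra pods is a separate concern handled by a node autoscaler.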

Key Technical Innovations

  1. Scalability:

    • Used Kubernetes (EKS) with KEDA for dynamic pod scaling and Karpenter for dynamic node scaling to handle fluctuating payment volumes.
    • Leveraged warm pools and custom VPC CNI plugins to ensure immediate pod schedulability and IP address management.
  2. Performance:

    • Optimized container startup times using Bottlerocket and multi-stage Docker builds.
    • Utilized custom Spring annotations to route read queries to Aurora read replicas and to perform cache writes asynchronously.
    • Evaluated Spring ahead-of-time (AOT) processing and GraalVM for further performance improvements.
  3. Resiliency:

    • Leveraged native cross-region replication capabilities of AWS services like Aurora, ElastiCache, and S3.
    • Used MSK Replicator to replicate Kafka topics and consumer offsets for seamless failover.
    • Implemented fault injection testing using AWS Fault Injection Service (formerly AWS Fault Injection Simulator).
  4. Observability:

    • Adopted OpenTelemetry through the AWS Distro for OpenTelemetry (ADOT) to capture logs, metrics, and traces and send them to a centralized observability account.
    • Utilized CloudWatch Application Signals to monitor service-level objectives (SLOs) and provide end-to-end visibility.
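
The read-routing idea under Performance can be illustrated with a small analog. The session describes custom Spring annotations; the sketch below is not that code but a Python decorator that plays the same role, assuming a thread-local flag decides which endpoint serves a query. `FakeEndpoint`, `read_only`, `get_status`, `record_payment`, and the in-memory `cache` are all hypothetical stand-ins.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from functools import wraps


class FakeEndpoint:
    """Stand-in for a database endpoint; records the statements it receives."""

    def __init__(self, name):
        self.name = name
        self.statements = []

    def execute(self, sql, params=()):
        self.statements.append((sql, params))
        return self.name


_context = threading.local()        # per-thread routing flag
writer = FakeEndpoint("writer")     # hypothetical primary endpoint
replica = FakeEndpoint("replica")   # hypothetical read-replica endpoint
cache = {}                          # stand-in for a cache client
cache_pool = ThreadPoolExecutor(max_workers=1)


def current_endpoint():
    """Pick the replica inside @read_only code, the writer everywhere else."""
    return replica if getattr(_context, "read_only", False) else writer


def read_only(func):
    """Decorator that routes every query issued by func to the read replica."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        previous = getattr(_context, "read_only", False)
        _context.read_only = True
        try:
            return func(*args, **kwargs)
        finally:
            _context.read_only = previous  # restore the flag for nested calls
    return wrapper


@read_only
def get_status(payment_id):
    return current_endpoint().execute(
        "SELECT status FROM payments WHERE id = %s", (payment_id,))


def record_payment(payment_id, status):
    served_by = current_endpoint().execute(
        "INSERT INTO payments (id, status) VALUES (%s, %s)", (payment_id, status))
    # The cache update runs on a background thread so the caller does not wait.
    pending = cache_pool.submit(cache.__setitem__, payment_id, status)
    return served_by, pending


served_by, pending = record_payment("p-1", "SETTLED")
pending.result()                    # wait only so the demo is deterministic
print(served_by, get_status("p-1"), cache)
```

Keeping the routing decision in a decorator means business code never names an endpoint directly, which is the same separation an annotation-based approach aims for.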

Business Impact and Outcomes

  • Reduced time-to-market from concept to production to 9 months, a significant improvement over FIS's traditional payment system deployments.
  • Enabled rapid deployment of new payment rails and capabilities, with 11 major initiatives planned for the next year.
  • Empowered financial institutions with self-service capabilities and automated decision-making through the use of AI and machine learning.
  • Facilitated the decommissioning of legacy payment systems and the migration of customers to the new Money Movement Hub platform.

Key Lessons Learned

  1. Event-Driven Architecture: Start with an event-driven, microservices-based design to achieve massive horizontal scalability and resiliency.
  2. Performance Optimization: Identify and optimize the critical real-time processing pipeline, leveraging high-performance compute, database, and caching services.
  3. Resilient by Design: Leverage native multi-region and multi-AZ capabilities of AWS services to build a highly resilient solution.
  4. Unified Observability: Implement end-to-end observability, including both technical and business-level metrics, to gain deep visibility into the system's health and performance.
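
As an illustration of a business-level metric tied to a service-level objective, the sketch below evaluates a latency SLO over one window of payments. The 5-second threshold comes from the performance requirement; the 99.9% target and the synthetic data are assumptions for the example, and the calculation is generic SLO arithmetic rather than the CloudWatch Application Signals API.

```python
def slo_report(latencies_s, threshold_s=5.0, target=0.999):
    """Summarise a latency SLO over one window of completed payments."""
    total = len(latencies_s)
    if total == 0:
        raise ValueError("need at least one observation")
    # A payment is "bad" if it did not finish in under the threshold.
    bad = sum(1 for latency in latencies_s if latency >= threshold_s)
    attainment = (total - bad) / total
    allowed_bad = (1 - target) * total  # error budget, expressed in payments
    return {
        "attainment": attainment,
        "budget_consumed": bad / allowed_bad if allowed_bad > 0 else float("inf"),
        "slo_met": attainment >= target,
    }


# Synthetic window: 9,996 fast payments and 4 slow ones.
window = [0.8] * 9996 + [6.2] * 4
report = slo_report(window)
print(report)
```

In this synthetic window 99.96% of payments meet the threshold, so the objective holds while about 40% of the window's error budget is used.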
