AWS re:Invent 2025 - Nasdaq: Build resilient infrastructure for global financial services (HMC327)

Building Resilient Infrastructure for Global Financial Services

NASDAQ's Cloud Journey

  • NASDAQ has been at the forefront of innovation for decades, operating over 30 different exchanges globally.
  • NASDAQ is both a system operator and a provider of software solutions for over 130 marketplaces worldwide, delivering these as traditional software installs, managed services, and SaaS.
  • NASDAQ's cloud journey began around 15 years ago, starting with T+1 backups and progressing towards more real-time, mission-critical systems, including matching engines and market systems.
  • Key challenges include:
    • Ultra-low-latency transactions measured in microseconds and nanoseconds, at millions of messages per second with no dropped or lost messages.
    • Specific physical location requirements due to proximity and latency needs, as well as an ecosystem of customers bringing their own hardware.
    • High resiliency targets of 100% uptime, and regulatory oversight across multiple jurisdictions.

Disaster Recovery Strategies

  • NASDAQ has evaluated the disaster recovery spectrum, ranging from backup and restore to active-active multi-region setups.
  • The focus is on the far right of the spectrum, targeting critical systems that must remain live at all times and cannot experience downtime.
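
To make the spectrum concrete, here is a minimal sketch that orders the strategies from left to right and picks one from a tolerable-downtime figure. The two intermediate names (pilot light, warm standby) come from AWS's general disaster recovery guidance rather than from the talk, and the thresholds and the `pick_strategy` helper are purely illustrative assumptions.

```python
from enum import IntEnum


class DRStrategy(IntEnum):
    """DR options ordered left to right: slowest recovery first."""
    BACKUP_AND_RESTORE = 1
    PILOT_LIGHT = 2
    WARM_STANDBY = 3
    ACTIVE_ACTIVE = 4


# Hypothetical upper bounds on tolerable downtime, in seconds.
# These numbers are illustrative only and were not given in the talk.
ILLUSTRATIVE_LIMITS = [
    (0.0, DRStrategy.ACTIVE_ACTIVE),
    (600.0, DRStrategy.WARM_STANDBY),
    (3600.0, DRStrategy.PILOT_LIGHT),
]


def pick_strategy(tolerable_downtime_s: float) -> DRStrategy:
    """Return the cheapest strategy whose limit covers the tolerable downtime."""
    for limit, strategy in ILLUSTRATIVE_LIMITS:
        if tolerable_downtime_s <= limit:
            return strategy
    return DRStrategy.BACKUP_AND_RESTORE
```

A system that tolerates no downtime, such as the critical systems described above, lands on `DRStrategy.ACTIVE_ACTIVE` under these assumed thresholds.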

Hybrid Compute with AWS Outposts

  • NASDAQ uses AWS Outposts to place workloads in its own data centers while still managing them through the EC2 API and AWS infrastructure.
  • This setup includes:
    • Direct Connect circuits to the AWS region, with Outposts racks containing EC2 servers.
    • Customized ultra-low latency network interface cards and bare metal servers to enable performance tuning and hardware multicast for equal access.

Failure Domains

  • NASDAQ analyzes failure domains at multiple levels:
    • Server level: Running hot spares and A/B pairs of software components on separate instances.
    • Rack level: Spreading A/B pairs across multiple racks to avoid single-rack failures.
    • Site level: Maintaining separate Outposts, in different AWS accounts and regions, for the primary and secondary data centers.
    • Region level: Running critical systems in a hot-hot configuration across multiple AWS regions.
  • For each level of the failure domain hierarchy, NASDAQ has well-documented operational runbooks to ensure a coordinated and predictable response during a disaster.
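
The failure-domain hierarchy above can be checked mechanically. The following is a minimal sketch, not NASDAQ's tooling: the `Placement` type, the identifiers, and `tolerated_failures` are hypothetical names used only to show how an A/B pair's placement determines which levels of failure it survives.

```python
from dataclasses import dataclass
from typing import List

# Failure-domain levels from widest to narrowest (illustrative naming).
LEVELS = ("site", "rack", "server")


@dataclass(frozen=True)
class Placement:
    """Where one member of an A/B pair runs (hypothetical model)."""
    name: str
    site: str
    rack: str
    server: str


def tolerated_failures(a: Placement, b: Placement) -> List[str]:
    """List the levels whose failure cannot take out both members.

    Each level is compared by its full path (site, then site+rack, ...),
    so two servers with the same label in different racks count as distinct.
    """
    tolerated = []
    for depth, level in enumerate(LEVELS, start=1):
        path_a = tuple(getattr(a, lvl) for lvl in LEVELS[:depth])
        path_b = tuple(getattr(b, lvl) for lvl in LEVELS[:depth])
        if path_a != path_b:
            tolerated.append(level)
    return tolerated
```

For example, a pair placed in different racks of the same site tolerates a server or rack failure but not a site failure, which is why the site and region levels need their own separation.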

Static Stability

  • NASDAQ designs for static stability to ensure the system can continue operating even in the face of network or infrastructure disruptions (e.g., a fiber cut).
  • Key aspects include:
    • Provisioning hot spare EC2 instances that can be activated without needing to interact with the EC2 control plane.
    • Carefully selecting which AWS services to deploy on Outposts to minimize dependencies.
    • Using local boot with direct access to the local gateway for "break glass" access, avoiding any hairpinning to the AWS region.
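
The hot-spare idea can be illustrated with a small sketch. It assumes the spares are already launched and running, so failover is only a local state change and never asks a control plane to create capacity. The `Node` and `StaticallyStablePool` names are hypothetical and do not correspond to any AWS API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """A pre-provisioned, already-running instance (hypothetical model)."""
    name: str
    healthy: bool = True


@dataclass
class StaticallyStablePool:
    """An active node plus hot spares that were provisioned ahead of time."""
    active: Node
    spares: List[Node]

    def failover(self) -> Node:
        """Promote the first healthy spare using only local state.

        Nothing here launches or resizes capacity, so the operation does
        not depend on the control plane being reachable.
        """
        for index, spare in enumerate(self.spares):
            if spare.healthy:
                self.active = self.spares.pop(index)
                return self.active
        raise RuntimeError("no healthy hot spare available")
```

The trade-off in this design is cost: spare capacity sits idle in normal operation so that recovery requires no new provisioning during a disruption.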

Disconnected Operation

  • NASDAQ extensively tests its ability to operate in a disconnected state, simulating scenarios where the connection to the AWS region is lost.
  • This involves understanding and mitigating any dependencies on services that may require connectivity to the region, such as SSL/TLS certificate renewals or AWS Identity and Access Management (IAM) checks.
  • The local gateway is a critical access point to maintain control and management of the systems during a disconnected state.
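
One way to reason about such dependencies is to record, for each one, how long it keeps working without the region and compare that against the outage being simulated. The sketch below is illustrative only: the `Dependency` type, the `at_risk` helper, and every grace period shown are assumptions, not figures from the talk.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class Dependency:
    """A service dependency and its assumed behaviour when disconnected."""
    name: str
    needs_region: bool
    offline_grace: timedelta  # how long it keeps working without the region


def at_risk(deps: Iterable[Dependency], outage: timedelta) -> List[str]:
    """Names of dependencies expected to fail before the outage ends."""
    return [
        d.name
        for d in deps
        if d.needs_region and d.offline_grace < outage
    ]


# Hypothetical inventory; the grace periods are made-up examples.
INVENTORY = [
    Dependency("certificate renewal", True, timedelta(days=30)),
    Dependency("IAM credential check", True, timedelta(hours=6)),
    Dependency("local gateway access", False, timedelta(0)),
]
```

Calling `at_risk(INVENTORY, timedelta(days=1))` on this made-up inventory flags only the IAM credential check, since the assumed certificate lifetime outlasts a one-day outage and local gateway access does not need the region.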

Future Enhancements

  • NASDAQ is exploring more dynamic deployment capabilities, including affinity and anti-affinity rules to optimize resource utilization and resiliency.
  • They are working closely with AWS on the next generation of Graviton CPUs and exploring the use of accelerated compute, such as FPGAs and GPUs, to enhance their offerings.
  • NASDAQ is also looking to expand the set of AWS services it can safely run on Outposts, focusing in particular on EKS local clusters to simplify management.

Key Takeaways

  • NASDAQ has built a highly resilient, low-latency infrastructure using AWS Outposts to host mission-critical financial systems in its own data centers.
  • Their approach emphasizes failure domain analysis, static stability, and the ability to operate in a disconnected state, ensuring continuous availability even in the face of disruptions.
  • NASDAQ's collaboration with AWS on hardware and service offerings is driving innovation to meet the unique requirements of the financial services industry.
  • The strategies and architectural patterns presented can be applied to other organizations with similarly demanding availability and performance requirements.
