AWS re:Invent 2025 - Fidelity: AWS Health & Support Data Pipelines to GenAI Actions (SPS318)

Transforming Operational Challenges into Strategic Advantage with AWS Health, Support Data, and GenAI

Operational Challenges in the Cloud Era

  • Organizations face growing complexity and large numbers of AWS accounts, resources, and events
  • Reactive monitoring is no longer sufficient; proactive action is needed to identify potential issues before they impact customers
  • A study cited in the session found that organizations with comprehensive data monitoring detect incidents 3.5x faster than those without

Harnessing AWS Health and Support Data

  • AWS Health provides insights into service disruptions, scheduled changes, and account-specific notifications
  • AWS Support cases contain valuable information on problems, solutions, workarounds, and lessons learned
  • Key goals:
    1. Ingest AWS Health events at enterprise scale
    2. Ingest AWS Support case data
    3. Ingest and route data at scale
    4. Leverage data to drive proactive, data-driven operations
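
As a rough illustration of the first goal, the sketch below filters incoming events the way an EventBridge rule for AWS Health might, then maps each one onto the three kinds of insight listed above. The `source` and `detail-type` values and the `eventTypeCategory` field follow the published AWS Health event format; the function name and the label strings are hypothetical and are not taken from the talk.

```python
from typing import Optional

# Event pattern in the shape EventBridge rules use; it matches AWS Health events.
HEALTH_EVENT_PATTERN = {
    "source": ["aws.health"],
    "detail-type": ["AWS Health Event"],
}

# Hypothetical labels for the event categories AWS Health reports.
CATEGORY_LABELS = {
    "issue": "service disruption",
    "scheduledChange": "scheduled change",
    "accountNotification": "account notification",
}


def classify_health_event(event: dict) -> Optional[str]:
    """Return a label for an AWS Health event, or None if the event does not match."""
    for key, allowed in HEALTH_EVENT_PATTERN.items():
        if event.get(key) not in allowed:
            return None
    category = event.get("detail", {}).get("eventTypeCategory")
    # Categories not listed above fall back to a generic label.
    return CATEGORY_LABELS.get(category, "uncategorized")
```

In a real deployment the pattern would live in an EventBridge rule and the classification would run in whatever consumer the rule targets; the in-process check here only makes the filtering logic visible.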

Fidelity's Cloud Event Notification Transport Service (SENSE)

  • Fidelity's journey from a simple pull-based system to an event-driven architecture
  • Challenges faced as Fidelity rapidly scaled its cloud footprint to 2,000 accounts and 5 million resources
  • Key features of SENSE:
    • Leverages the AWS Health delegated administrator account and Amazon EventBridge integration
    • Enriches raw events with additional context and metadata
    • Provides personalized notification preferences for users
    • Integrates with Fidelity's internal incident management and communication systems
    • Resilient architecture with regional backups and failover capabilities
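
The enrichment and preference-based notification features can be pictured with a minimal sketch. It assumes a hypothetical account directory keyed by account ID and a hypothetical `Preference` record; none of these names or fields come from the talk, and the `affectedAccount` field is assumed to be present on events delivered through the delegated administrator account.

```python
from dataclasses import dataclass, field


@dataclass
class Preference:
    """Hypothetical per-user notification preference."""
    user: str
    services: set = field(default_factory=set)    # empty set means "all services"
    categories: set = field(default_factory=set)  # empty set means "all categories"


def enrich(event: dict, account_directory: dict) -> dict:
    """Attach owner and environment metadata for the affected account."""
    detail = event.get("detail", {})
    # Fall back to the envelope's account field if affectedAccount is absent.
    account_id = detail.get("affectedAccount", event.get("account"))
    meta = account_directory.get(account_id, {})
    return {
        "account_id": account_id,
        "service": detail.get("service"),
        "category": detail.get("eventTypeCategory"),
        "owner_team": meta.get("owner_team", "unknown"),
        "environment": meta.get("environment", "unknown"),
    }


def recipients(enriched: dict, preferences: list) -> list:
    """Return the users whose preferences match the enriched event."""
    matched = []
    for pref in preferences:
        service_ok = not pref.services or enriched["service"] in pref.services
        category_ok = not pref.categories or enriched["category"] in pref.categories
        if service_ok and category_ok:
            matched.append(pref.user)
    return matched
```

Keeping enrichment separate from recipient matching means the same enriched record could also feed an incident management integration without re-reading the directory.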

Applying GenAI to Enhance Operational Intelligence

  • Fidelity's "Machine Augmented Key Insights" (MAKI) framework:
    • Ingests and augments AWS Support case data with reference information
    • Analyzes event-level and aggregate-level patterns using different GenAI models
    • Provides summaries, recommendations, and remediation plans
  • Use cases:
    1. Identifying trends and correlating events across AWS Health and AWS Support data
    2. Providing guidance and recommendations for deprecation and expiration events (e.g., RDS certificate expiration)
    3. Automating remediation and enhancing operational resilience
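
The first use case, correlating Health events with Support cases, can be sketched as below. The summary does not describe how MAKI does this internally, so this is an assumed approach: records are taken to be already normalized into dictionaries with hypothetical `id`, `service`, and timestamp keys, and matching is a simple same-service, within-a-time-window rule. The prompt text is likewise illustrative.

```python
from datetime import datetime, timedelta


def correlate(health_events: list, support_cases: list,
              window: timedelta = timedelta(hours=24)) -> list:
    """Pair each Health event with Support cases for the same service opened within the window."""
    pairs = []
    for event in health_events:
        for case in support_cases:
            same_service = event["service"].lower() == case["service"].lower()
            close_in_time = abs(case["opened"] - event["started"]) <= window
            if same_service and close_in_time:
                pairs.append((event["id"], case["id"]))
    return pairs


def build_prompt(pairs: list) -> str:
    """Assemble a plain-text prompt asking a model to summarize the correlated records."""
    lines = ["Summarize likely root causes and suggest remediation for these correlations:"]
    lines += [f"- Health event {event_id} relates to Support case {case_id}"
              for event_id, case_id in pairs]
    return "\n".join(lines)
```

The resulting string would be passed to whichever model handles aggregate-level analysis; that call is left out because the talk summary does not specify the model or API used.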

Towards a Holistic, Data-Driven Operational Platform

  • Integrating additional data sources (observability, change management, cost) to enrich the operational view
  • Exploring the potential of aggregating signals from multiple cloud providers and on-premises environments
  • Leveraging public data sources (news, social media, weather, financial markets) to anticipate and proactively respond to potential issues

Key Takeaways

  • Comprehensive data monitoring and proactive actions are crucial for managing the complexity of cloud environments
  • Combining AWS Health and Support data provides valuable operational intelligence
  • Fidelity's SENSE platform demonstrates how to ingest, enrich, and route this data at enterprise scale
  • Applying GenAI techniques can further enhance operational intelligence and automate remediation
  • Continuously expanding the data sources and integrating them into a holistic, data-driven operational platform is the ultimate goal
