AWS re:Invent 2025 - Elevate application and generative AI observability (COP326)

Elevating Application and Generative AI Observability

Challenges with Complex Application Monitoring

  • Lack of visibility and observability in today's complex, AI-powered applications
  • Difficulty understanding how systems are performing and responding to user demand
  • Need for comprehensive monitoring and observability to avoid "guessing" about application behavior

Comprehensive Application Monitoring with Amazon CloudWatch

  • CloudWatch provides native integration and a single pane of glass for monitoring across multiple layers:
    • Infrastructure (EC2, containers, serverless, on-premises)
    • Application (logs, traces, service-level objectives)
    • Database
    • User experience (real user monitoring, synthetic monitoring)
  • Leverages the three pillars of observability: metrics, logs, and traces
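One way metrics and logs converge in CloudWatch is the Embedded Metric Format (EMF), where a single structured log line also yields custom metrics. A minimal sketch of building an EMF record by hand (the namespace, dimension, and metric names here are invented for illustration):

```python
import json
import time

def emf_record(namespace: str, dimensions: dict, metrics: dict) -> str:
    """Build a CloudWatch Embedded Metric Format (EMF) log line.

    CloudWatch extracts the declared metrics from the log event, so one
    structured log line produces both a searchable log entry and metrics.
    """
    record = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),
            "CloudWatchMetrics": [{
                "Namespace": namespace,
                "Dimensions": [list(dimensions.keys())],
                "Metrics": [{"Name": n, "Unit": u} for n, (u, _) in metrics.items()],
            }],
        },
        **dimensions,
        **{n: v for n, (_, v) in metrics.items()},
    }
    return json.dumps(record)

# Hypothetical checkout service emitting latency and error metrics.
line = emf_record(
    namespace="MyApp",
    dimensions={"Service": "checkout"},
    metrics={"Latency": ("Milliseconds", 142), "Errors": ("Count", 1)},
)
print(line)
```

Writing this line to a log group ingested by CloudWatch would surface `Latency` and `Errors` as custom metrics without a separate `PutMetricData` call.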

Monitoring Application Health with Golden Signals

  • Key metrics to monitor:
    • Request volume
    • Latency
    • Errors and faults
  • Tying these technical metrics to business impact:
    • Revenue per minute
    • Page load time
    • API error codes
    • Session duration
  • Establishing service-level objectives (SLOs) to maintain optimal application health
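The golden signals above can be derived directly from raw request data. A minimal sketch (the sample numbers are invented) computing volume, error rate, p99 latency, and how much of a 99.9% SLO's error budget has been consumed:

```python
# Sample request log: (latency_ms, is_error). Values are illustrative only.
requests = [(120, False), (95, False), (310, True), (88, False)] * 250

total = len(requests)
errors = sum(1 for _, err in requests if err)
latencies = sorted(lat for lat, _ in requests)

# Golden signals: request volume, error rate, p99 latency.
error_rate = errors / total
p99 = latencies[int(0.99 * total) - 1]

# Error budget for a 99.9% availability SLO: 0.1% of requests may fail.
slo_target = 0.999
budget = 1 - slo_target
budget_consumed = error_rate / budget  # > 1.0 means the budget is exhausted

print(f"volume={total} error_rate={error_rate:.1%} p99={p99}ms "
      f"budget_consumed={budget_consumed:.0f}x")
```

Framing errors as budget consumption is what lets a technical SLO map onto a business-level SLA: a burn rate above 1.0 signals the agreement is at risk before customers escalate.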

Enhancing Observability with Amazon CloudWatch Application Insights

  • Automatically discovers applications and provides pre-built dashboards for key metrics
  • Enables easy root cause analysis for issues like HTTP errors and exceptions
  • Allows defining SLOs and tying them to business-level service-level agreements (SLAs)

Monitoring Generative AI Workloads

  • Challenges with observing AI-powered applications:
    • Non-deterministic agent behavior
    • Difficulty tracing and analyzing the sequence of AI model invocations
    • Assessing system health and quality of AI responses
  • Amazon CloudWatch generative AI observability capabilities:
    • 360-degree view of AI agents across different frameworks
    • Simple instrumentation using OpenTelemetry
    • End-to-end prompt tracing and data protection
    • Continuous evaluation of AI response quality
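The "continuous evaluation" bullet can be illustrated with a toy quality check. The scoring heuristic below is entirely invented (real evaluation pipelines use LLM-as-a-judge or embedding-based metrics); it only shows the shape of an evaluation loop that flags responses not grounded in the retrieved context:

```python
def groundedness(response: str, context: str) -> float:
    """Toy evaluator: fraction of response words that appear in the
    retrieved context. Illustrative only; a production evaluator would
    use an LLM judge or semantic similarity, not word overlap."""
    resp_words = [w.lower().strip(".,") for w in response.split()]
    ctx_words = {w.lower().strip(".,") for w in context.split()}
    if not resp_words:
        return 0.0
    return sum(w in ctx_words for w in resp_words) / len(resp_words)

context = "The refund window is 30 days from the delivery date."
good = "The refund window is 30 days."
bad = "Refunds are accepted within one year of purchase."

for resp in (good, bad):
    score = groundedness(resp, context)
    status = "ok" if score >= 0.8 else "flag for review"
    print(f"{score:.2f} {status} :: {resp}")
```

Running such checks on a sample of production traffic, and emitting the scores as metrics, is what turns response quality into something an alarm can watch.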

Integrating Observability for AI Agents on AWS

  • Leveraging AWS services like Amazon Bedrock and Amazon Bedrock AgentCore to build and deploy AI agents
  • Instrumenting AI workloads using OpenTelemetry to send telemetry data to Amazon CloudWatch
  • Utilizing CloudWatch's pre-built dashboards and capabilities to monitor AI agent performance, quality, and behavior

Key Takeaways

  • Comprehensive application monitoring is crucial for understanding complex, AI-powered systems
  • Amazon CloudWatch provides a unified observability platform to monitor applications across all layers
  • Establishing SLOs and tying them to business metrics enables proactive management of application health
  • Observability for generative AI workloads requires new capabilities to understand agent behavior, quality, and reasoning
  • AWS provides a full stack of services and tools to build, deploy, and observe AI-powered applications
