AWS re:Invent 2025 - Observability for AI Agents and Traditional Workloads (COP335)
Key Challenges Addressed
Application Complexity : Increased use of microservices and AI agents creating new dependencies and behaviors that are difficult to understand.
Standardization : Lack of consistency in metrics, logs, and transaction representation across services and teams.
Prioritization : Difficulty distinguishing customer-impacting incidents from less critical anomalies.
Disjoint Experiences : Multiple disconnected tools for logs, traces, and metrics, making it hard to correlate issues.
Increased Entropy : AI agents can introduce new dependencies and behaviors that further increase system complexity and unpredictability.
Application Signals: AWS's Observability Solution
Provides discovery of services, dependencies, and standardized operational practices.
Allows defining digital journeys (e.g. mobile user logins, payments) to prioritize monitoring.
Correlates metrics, logs, and traces in a unified experience for faster root cause analysis.
Collects telemetry using OpenTelemetry for managed services and customer applications.
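To make the idea of correlating signals concrete, here is a minimal sketch of joining log lines to trace spans through a shared trace ID. It is only an illustration of the general technique: the record fields (trace_id, service, duration_ms, error) and the sample data are assumptions made for this example, not the Application Signals data model or API.

```python
from collections import defaultdict

# Hypothetical telemetry records; field names are illustrative, not an AWS schema.
spans = [
    {"trace_id": "t1", "service": "checkout", "duration_ms": 1840, "error": True},
    {"trace_id": "t2", "service": "checkout", "duration_ms": 95, "error": False},
]
logs = [
    {"trace_id": "t1", "level": "ERROR", "message": "payment gateway timeout"},
    {"trace_id": "t2", "level": "INFO", "message": "order placed"},
    {"trace_id": "t1", "level": "WARN", "message": "retrying payment call"},
]


def correlate(spans, logs):
    """Attach to each span the log lines that share its trace ID."""
    logs_by_trace = defaultdict(list)
    for entry in logs:
        logs_by_trace[entry["trace_id"]].append(entry)
    return [
        {**span, "logs": logs_by_trace.get(span["trace_id"], [])}
        for span in spans
    ]


def failing_traces(correlated):
    """Return (trace_id, log messages) for every span marked as an error."""
    return [
        (item["trace_id"], [log["message"] for log in item["logs"]])
        for item in correlated
        if item["error"]
    ]


if __name__ == "__main__":
    for trace_id, messages in failing_traces(correlate(spans, logs)):
        print(trace_id, messages)
```

The point of the join is that an engineer starting from a slow or failed trace lands directly on the related log lines instead of searching a separate tool.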
Key Features and Enhancements
Complete Application Picture :
Automatically discovers dependencies and topology even for uninstrumented services.
Reduces manual work to understand application structure.
Organized Observability :
Automatically detects and groups applications based on connectivity and attributes.
Allows organizing observability by business units, teams, etc. to match how users work.
Faster Issue Resolution :
Provides automatic operational audits and change impact analysis.
Enables quick identification of important, customer-impacting issues.
AI Productivity Integration :
Connects observability data to users' own AI/ML tools and agents.
Provides operational insights and audits to enhance AI agent productivity.
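The grouping by connectivity mentioned under Organized Observability can be pictured as finding connected components in a service dependency graph. The sketch below is one plausible way to do that, not a description of how Application Signals implements it; the service names and edges are made up for the example.

```python
from collections import defaultdict


def group_by_connectivity(edges, services=()):
    """Group services into sets that are linked by at least one call path.

    edges is an iterable of (caller, callee) pairs; services may list
    additional services that have no observed calls.
    """
    neighbors = defaultdict(set)
    all_services = set(services)
    for caller, callee in edges:
        neighbors[caller].add(callee)
        neighbors[callee].add(caller)
        all_services.update((caller, callee))

    seen = set()
    groups = []
    for service in sorted(all_services):
        if service in seen:
            continue
        # Depth-first walk that collects everything reachable from this service.
        group = set()
        stack = [service]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbors[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


if __name__ == "__main__":
    # Hypothetical topology: a storefront, an AI agent, and an isolated batch job.
    edges = [("web", "cart"), ("cart", "orders-db"), ("agent", "vector-store")]
    for group in group_by_connectivity(edges, services=["batch-report"]):
        print(group)
```

A real system would likely combine connectivity with attributes such as team or business-unit tags, as the feature description suggests; this sketch covers only the connectivity part.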
AI Agent Observability
Instruments AI agents using OpenTelemetry to capture their interactions and behaviors.
Allows visibility into how AI agents are using databases, responding to users, etc.
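As a rough picture of what instrumenting an agent captures, the sketch below records nested, timed spans for one agent turn: a database lookup followed by a model call. It deliberately uses only the Python standard library as a stand-in for a tracing SDK, so the span helper, the span names, and the attribute names are all assumptions for illustration; with the OpenTelemetry Python API the equivalent step would be opening spans through a tracer rather than this hand-rolled context manager.

```python
import time
import uuid
from contextlib import contextmanager

FINISHED_SPANS = []  # in-memory stand-in for a telemetry exporter
_active = []         # stack of spans that are currently open


@contextmanager
def span(name, **attributes):
    """Record a timed span nested under whichever span is currently open."""
    record = {
        "span_id": uuid.uuid4().hex[:8],
        "parent_id": _active[-1]["span_id"] if _active else None,
        "name": name,
        "attributes": dict(attributes),
    }
    _active.append(record)
    start = time.perf_counter()
    try:
        yield record
    except Exception as exc:
        # Keep the failure on the span, then let the error propagate.
        record["attributes"]["error"] = repr(exc)
        raise
    finally:
        record["duration_ms"] = (time.perf_counter() - start) * 1000
        _active.pop()
        FINISHED_SPANS.append(record)


def handle_request(question):
    """Hypothetical agent turn: one database lookup, then one model call."""
    with span("agent.handle_request", user_question=question):
        with span("agent.db_query", table="orders") as db:
            rows = [{"order_id": 42, "status": "shipped"}]  # fake query result
            db["attributes"]["row_count"] = len(rows)
        with span("agent.model_call", model="example-model") as llm:
            answer = f"Order {rows[0]['order_id']} is {rows[0]['status']}."
            llm["attributes"]["response_chars"] = len(answer)
        return answer


if __name__ == "__main__":
    print(handle_request("Where is my order?"))
    for finished in FINISHED_SPANS:
        print(finished["name"], finished["parent_id"], finished["attributes"])
```

The parent and child links are what let a trace view show that a given user response involved a specific database query and model call, and how long each took.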
Mobile Observability
Expands Real User Monitoring (RUM) to cover iOS and Android mobile apps.
Provides visibility into mobile app crashes, user impact analysis, and source-level debugging.
Uninstrumented Service Observability
Automatically discovers and maps dependencies for services without instrumentation.
Supports automatic instrumentation for EKS clusters and other managed services.
Cross-Account Observability
Provides a unified view of applications and dependencies across multiple AWS accounts.
Allows tracking shared resources and cross-account interactions.
AI Ops Integration
Provides MCP (Model Context Protocol) servers to enhance AI agent productivity.
Integrates with GitHub Actions to leverage production telemetry for developer workflows.
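For readers unfamiliar with MCP (the Model Context Protocol), an agent talks to an MCP server with JSON-RPC 2.0 messages, and tool invocations use the "tools/call" method. The sketch below only builds such a request as a plain dictionary. The tool name get_service_health and its arguments are hypothetical, chosen for illustration; they are not taken from the talk or from any AWS MCP server.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id per call


def build_tool_call(tool_name, arguments):
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool and arguments, for illustration only.
request = build_tool_call(
    "get_service_health",
    {"service": "checkout", "window_minutes": 30},
)

if __name__ == "__main__":
    print(json.dumps(request, indent=2))
```

In practice an MCP client library would handle transport, the initialization handshake, and tool discovery; the sketch shows only the shape of a single call.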
Business Impact and Customer Results
CCC (a leading auto claims provider) saw:
50% reduction in mean time to resolution (MTTR)
40% cost savings
Improved developer, SRE, and testing team productivity
Proactive issue detection and resolution, leading to better system performance
Key Takeaways
Amazon CloudWatch Application Signals provides comprehensive observability for complex, AI-powered applications.
Features like automatic discovery, organized observability, and fast issue resolution help teams be more productive.
Integrating observability data with AI agents and developer workflows further enhances productivity.
Customers like CCC have seen significant improvements in cost, efficiency, and system performance.