AWS re:Invent 2025 - Build observable AI agents with Strands, AgentCore, and Datadog (AIM233)
Observability Challenges in Modern Architectures
Modern architectures have led to an explosion in complexity:
Diverse technologies, multiple clouds, open-source frameworks, SaaS providers
Increasingly ephemeral compute (e.g., serverless functions, containers)
Rapid rate of change
Adding AI agents further multiplies this complexity:
Agents are compound systems that combine vector stores, models, evaluations, and orchestration
Agents operate with autonomy and can be non-deterministic
Accountability is shared across models, frameworks, orchestration, and tool calls
Key challenges in running agents at scale:
Reliability: Ensuring agents avoid hallucinations and produce consistently high-quality output
Troubleshooting: Complexity makes it difficult to identify the root cause of issues
Cost: Every model interaction consumes tokens, which can escalate quickly
Security and safety: Enforcing guardrails and secure operation
Building and Deploying Agents with Strands Agents and Amazon Bedrock AgentCore
Strands Agents: An open-source Python and TypeScript SDK for building agent-based applications
Model-agnostic, with support for multiple model providers (e.g., OpenAI, Anthropic, Amazon Bedrock)
Includes support for MCP (Model Context Protocol) and A2A (Agent2Agent protocol)
Deploying Agents with Amazon Bedrock AgentCore
AgentCore is a fully managed service that provides runtime, memory, tool gateway, and observability
Supports any agent framework and model, with isolated and secure compute environments
Provides short-term and long-term memory management, enabling agents to maintain context across sessions
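The split between short-term and long-term memory can be illustrated with a minimal sketch. This is not the AgentCore API: the `AgentMemory` class, its method names, and the session and user identifiers are all hypothetical, chosen only to show how recent turns might be scoped to a session while durable facts follow a user across sessions.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Hypothetical two-tier memory; an illustration, not the AgentCore API."""

    short_term_limit: int = 4
    # Short-term: recent turns per session, oldest turns dropped first.
    _sessions: dict = field(default_factory=dict)
    # Long-term: durable facts per user, shared across sessions.
    _facts: dict = field(default_factory=dict)

    def add_turn(self, session_id: str, role: str, text: str) -> None:
        turns = self._sessions.setdefault(
            session_id, deque(maxlen=self.short_term_limit)
        )
        turns.append((role, text))

    def remember(self, user_id: str, key: str, value: str) -> None:
        self._facts.setdefault(user_id, {})[key] = value

    def context(self, user_id: str, session_id: str) -> dict:
        """Assemble what the agent would see at the start of a turn."""
        return {
            "facts": dict(self._facts.get(user_id, {})),
            "recent_turns": list(self._sessions.get(session_id, [])),
        }


memory = AgentMemory(short_term_limit=2)
memory.add_turn("s1", "user", "Hi")
memory.add_turn("s1", "assistant", "Hello")
memory.add_turn("s1", "user", "Book a flight")
memory.remember("u1", "home_airport", "SEA")

# A new session for the same user has no turns yet but keeps the facts.
new_session = memory.context("u1", "s2")
old_session = memory.context("u1", "s1")
```

In this sketch the bounded `deque` stands in for a context window budget, and the per-user dictionary stands in for whatever durable store a managed service would provide.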
Operationalizing Agents with the AWS Well-Architected Framework
Applying the AWS Well-Architected Framework's Generative AI Lens:
Operational Excellence:
Collect metrics, user feedback, and functional performance data
Implement guardrails to detect policy violations, prompt injection, and PII exposure
Monitor success and latency of API/tool calls, and track costs per workflow, user, and model
Measure and report inference efficiencies to guide sustainable scaling
Security:
Enforce security policies and guardrails to ensure safe agent operation
Monitor for potential security threats, such as prompt injection or PII exposure
Reliability:
Ensure agents operate consistently and with high quality
Troubleshoot issues by analyzing agent reasoning paths and identifying root causes
Performance Efficiency:
Optimize agent performance by experimenting with different models and prompts
Monitor and manage agent resource utilization (e.g., token consumption)
Cost Optimization:
Track and manage the costs associated with agent operation, including model interactions and tool calls
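Tracking cost per workflow, user, and model comes down to tagging each model call with those dimensions and aggregating token usage against a price table. The sketch below assumes made-up model names and prices (`PRICES_PER_1K`) and a hypothetical `ModelCall` record; real prices and token counts would come from the model provider and the agent's telemetry.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical prices in USD per 1,000 tokens: (input, output).
PRICES_PER_1K = {
    "model-small": (0.001, 0.002),
    "model-large": (0.01, 0.03),
}


@dataclass(frozen=True)
class ModelCall:
    workflow: str
    user: str
    model: str
    input_tokens: int
    output_tokens: int


def call_cost(call: ModelCall) -> float:
    """Cost of one call: tokens times the per-1K price for its model."""
    in_price, out_price = PRICES_PER_1K[call.model]
    return (call.input_tokens * in_price + call.output_tokens * out_price) / 1000


def cost_by(calls, dimension: str) -> dict:
    """Sum call costs grouped by 'workflow', 'user', or 'model'."""
    totals = defaultdict(float)
    for call in calls:
        totals[getattr(call, dimension)] += call_cost(call)
    return {key: round(value, 6) for key, value in totals.items()}


calls = [
    ModelCall("support", "alice", "model-small", 2000, 500),
    ModelCall("support", "bob", "model-large", 1000, 1000),
    ModelCall("research", "alice", "model-large", 4000, 2000),
]

by_workflow = cost_by(calls, "workflow")
by_user = cost_by(calls, "user")
by_model = cost_by(calls, "model")
```

With these assumed prices, the single `research` call costs more than both `support` calls combined, which is the kind of skew that per-dimension breakdowns are meant to surface.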
Operationalizing Observability with Datadog
Integrating Strands Agents with Datadog for observability:
Strands Agents emits telemetry (metrics, traces, logs) out of the box, which can be sent to Datadog
Datadog's AI Observability features enable:
Troubleshooting: Analyzing agent reasoning paths, identifying root causes of issues
Monitoring: Setting up alerts and automating actions based on key telemetry
Evaluation: Implementing pre-built and custom evaluations to assess agent quality
Experimentation: Comparing the performance of different models and prompts
Datadog's Trace Explorer provides visibility into agent behavior:
Observing input/output, tool calls, security/safety checks, and cost metrics
Identifying potential issues like PII exposure or prompt injection attempts
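To make the idea of a trace concrete, here is a minimal sketch of the kind of data a trace view is built from: nested spans for the agent and its tool calls, each with a status, a duration, and tags such as a PII flag. It uses only the Python standard library and is not Datadog's or Strands' instrumentation; `Tracer`, `Span`, `record_output`, and the email-only PII pattern are hypothetical simplifications.

```python
import re
import time
from contextlib import contextmanager
from dataclasses import dataclass, field

# Deliberately simple pattern; real PII detection needs far more than this.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass
class Span:
    name: str
    kind: str  # e.g. "agent", "llm", "tool"
    status: str = "ok"
    duration_ms: float = 0.0
    tags: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


class Tracer:
    def __init__(self):
        self.roots = []
        self._stack = []

    @contextmanager
    def span(self, name, kind):
        span = Span(name, kind)
        # Attach to the currently open span, or start a new trace.
        siblings = self._stack[-1].children if self._stack else self.roots
        siblings.append(span)
        self._stack.append(span)
        start = time.perf_counter()
        try:
            yield span
        except Exception as exc:
            span.status = "error"
            span.tags["error"] = repr(exc)
            raise
        finally:
            span.duration_ms = (time.perf_counter() - start) * 1000
            self._stack.pop()


def record_output(span, text):
    """Store the output on the span and flag suspected PII."""
    span.tags["output"] = text
    span.tags["pii_suspected"] = bool(EMAIL_RE.search(text))


tracer = Tracer()
with tracer.span("handle_request", "agent") as root:
    with tracer.span("lookup_customer", "tool") as tool_span:
        record_output(tool_span, "Contact: jane@example.com")
    try:
        with tracer.span("charge_card", "tool"):
            raise TimeoutError("payment gateway timed out")
    except TimeoutError:
        pass  # the agent would decide how to recover here
```

After this runs, the root span holds two child tool spans: one tagged as containing suspected PII and one marked with an error status, which is the structure a reviewer would walk through when looking for a root cause.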
Datadog's Evaluation capabilities:
Pre-built evaluations for failure to answer, hallucination, input/output toxicity, etc.
Ability to create custom evaluations to assess brand voice, goal completion, and more
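A custom evaluation is, at its simplest, a function that takes an interaction and returns a named pass/fail result. The sketch below is a hypothetical, rule-based illustration and not Datadog's evaluation API: the `Interaction` and `EvalResult` types, the refusal markers, and the phrase-matching goal check are all assumptions, and production evaluators often rely on a model as the judge instead of string matching.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical phrases treated as a non-answer.
REFUSAL_MARKERS = ("i don't know", "i cannot help", "i'm not sure")


@dataclass
class Interaction:
    user_input: str
    agent_output: str


@dataclass
class EvalResult:
    name: str
    passed: bool


def failure_to_answer(interaction: Interaction) -> EvalResult:
    """Fail when the output is empty or contains a refusal marker."""
    text = interaction.agent_output.strip().lower()
    failed = not text or any(marker in text for marker in REFUSAL_MARKERS)
    return EvalResult("failure_to_answer", passed=not failed)


def goal_completion(required_phrase: str) -> Callable[[Interaction], EvalResult]:
    """Build an evaluator that passes when the output contains the phrase."""

    def evaluate(interaction: Interaction) -> EvalResult:
        done = required_phrase.lower() in interaction.agent_output.lower()
        return EvalResult("goal_completion", passed=done)

    return evaluate


def run_evals(interaction, evaluators) -> dict:
    results = (evaluator(interaction) for evaluator in evaluators)
    return {result.name: result.passed for result in results}


evaluators = [failure_to_answer, goal_completion("order confirmed")]
good = Interaction("Place my order", "Order confirmed. Your receipt is on its way.")
bad = Interaction("Place my order", "I'm not sure how to do that.")

good_scores = run_evals(good, evaluators)
bad_scores = run_evals(bad, evaluators)
```

Keeping each evaluator as a plain callable makes it straightforward to mix rule-based checks like these with model-based ones behind the same interface.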
Datadog's Experimentation features:
Comparing the performance of different models and prompts
Measuring key metrics like information accuracy, user satisfaction, and cost
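Comparing models and prompts amounts to running the same test cases against each variant and summarizing the results side by side. The following sketch assumes hypothetical run records with a `variant` label, a `correct` flag, and a `cost_usd` value; the variant names and numbers are invented for illustration and do not come from the talk.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical run records: one per test case per variant.
runs = [
    {"variant": "model-a/prompt-v1", "correct": True, "cost_usd": 0.004},
    {"variant": "model-a/prompt-v1", "correct": False, "cost_usd": 0.004},
    {"variant": "model-b/prompt-v2", "correct": True, "cost_usd": 0.010},
    {"variant": "model-b/prompt-v2", "correct": True, "cost_usd": 0.012},
]


def summarize(runs) -> dict:
    """Group runs by variant and compute accuracy and average cost."""
    grouped = defaultdict(list)
    for run in runs:
        grouped[run["variant"]].append(run)
    return {
        variant: {
            "accuracy": mean(1.0 if r["correct"] else 0.0 for r in rs),
            "avg_cost_usd": round(mean(r["cost_usd"] for r in rs), 6),
            "runs": len(rs),
        }
        for variant, rs in grouped.items()
    }


summary = summarize(runs)
# Pick the most accurate variant; cost is reported alongside so the
# trade-off stays visible rather than being folded into one score.
best = max(summary, key=lambda variant: summary[variant]["accuracy"])
```

In this invented data set the more accurate variant is also the more expensive one, so the summary supports a decision but does not make it.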
Key Takeaways
Observability is crucial for building reliable, secure, and cost-efficient AI agents at scale
Strands Agents provides an SDK for building agents, and Amazon Bedrock AgentCore provides a managed platform for deploying them
The AWS Well-Architected Framework's Generative AI Lens offers best practices for operationalizing agents
Datadog's AI Observability features enable comprehensive monitoring, troubleshooting, evaluation, and experimentation for agent-based applications