AWS re:Invent 2025 - Build and scale AI: from reliable agents to transformative systems (INV204)

Building Trusted AI Agents at Scale

Importance of Trust in AI Systems

Building AI agents that can be trusted for production systems is a key challenge

Lack of reliability, transparency, and safety can turn the most brilliant algorithms into expensive experiments

Trust must be built into AI systems from the ground up

Key Pillars of Trusted AI Agents

Reliability:

Importance of building on a secure, extensive, and reliable global cloud infrastructure
AWS offerings like EC2 instances, Trainium chips, and co-designed hardware/software for faster, safer, and more efficient AI workloads
Customization capabilities to align AI models with business needs and domain-specific requirements

Transparency:

Importance of observability and visibility into AI agent behavior, performance, and decision-making
Amazon SageMaker HyperPod's built-in observability for ML infrastructure and workflows
Amazon Bedrock AgentCore observability to trace agent actions, replay workflows, and audit agent behavior
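
To make the trace, replay, and audit idea concrete, here is a minimal Python sketch. It is not the AgentCore or HyperPod API. The names AgentTrace, TraceEvent, traced_call, and the toy TOOLS registry are all hypothetical, and the tools are assumed to be deterministic functions so that a replay can be compared against the recorded results.

```python
import json
import time
from dataclasses import dataclass, field, asdict
from typing import Any, Callable, Dict, List


@dataclass
class TraceEvent:
    """One recorded agent action (hypothetical schema)."""
    step: int
    tool: str
    arguments: Dict[str, Any]
    result: Any
    timestamp: float


@dataclass
class AgentTrace:
    """Append-only log of agent actions that can be exported or replayed."""
    events: List[TraceEvent] = field(default_factory=list)

    def record(self, tool: str, arguments: Dict[str, Any], result: Any) -> TraceEvent:
        event = TraceEvent(len(self.events) + 1, tool, arguments, result, time.time())
        self.events.append(event)
        return event

    def to_json(self) -> str:
        # Serialized form suitable for an audit store.
        return json.dumps([asdict(e) for e in self.events])

    def replay(self, tools: Dict[str, Callable[..., Any]]) -> List[int]:
        """Re-run each recorded call and return the steps whose result changed."""
        return [e.step for e in self.events
                if tools[e.tool](**e.arguments) != e.result]


def traced_call(trace: AgentTrace, tools: Dict[str, Callable[..., Any]],
                tool: str, **arguments: Any) -> Any:
    """Invoke a tool and record the call in the trace."""
    result = tools[tool](**arguments)
    trace.record(tool, arguments, result)
    return result


# Toy tool registry standing in for real agent tools (assumption).
TOOLS = {"add": lambda a, b: a + b, "upper": lambda text: text.upper()}

trace = AgentTrace()
traced_call(trace, TOOLS, "add", a=2, b=3)
traced_call(trace, TOOLS, "upper", text="ok")
```

Replaying the trace against the same tools should report no mismatches, while replaying against a changed tool flags the affected step. That comparison is the kind of check an audit workflow could build on.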

Safety and Governance:

Importance of setting clear guidelines and policies for AI agent behavior
AWS Responsible AI Lens to guide best practices for secure, compliant, and ethical AI deployment
Amazon Bedrock AgentCore's identity management, sandboxing, and policy controls to enforce data and access restrictions
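
One way to picture policy controls that enforce data and access restrictions is an allow-list check that runs before every agent action. The sketch below is illustrative only and does not reflect how AgentCore implements policies. Policy, PolicyViolation, POLICIES, guarded_call, and the agent, tool, and scope names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet


@dataclass(frozen=True)
class Policy:
    """Per-agent allow-lists (hypothetical structure)."""
    allowed_tools: FrozenSet[str]
    allowed_data_scopes: FrozenSet[str]


class PolicyViolation(Exception):
    """Raised when an agent attempts an action outside its policy."""


# Example policy table keyed by agent identity (assumed values).
POLICIES: Dict[str, Policy] = {
    "claims-reviewer": Policy(
        allowed_tools=frozenset({"read_claim"}),
        allowed_data_scopes=frozenset({"claims"}),
    ),
}


def guarded_call(agent_id: str, tool: str, data_scope: str,
                 action: Callable[[], Any]) -> Any:
    """Run `action` only if the agent's policy permits the tool and data scope."""
    policy = POLICIES.get(agent_id)
    if policy is None:
        # Deny by default: unknown identities get no access.
        raise PolicyViolation(f"unknown agent: {agent_id}")
    if tool not in policy.allowed_tools:
        raise PolicyViolation(f"{agent_id} may not use tool {tool}")
    if data_scope not in policy.allowed_data_scopes:
        raise PolicyViolation(f"{agent_id} may not access scope {data_scope}")
    return action()
```

The check denies by default, so an agent with no registered policy cannot act at all. Calling guarded_call("claims-reviewer", "read_claim", "claims", ...) runs the action, while a request for any other tool or data scope raises PolicyViolation.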

Ease of Use:

Importance of making AI agent development accessible to a wide range of users
Strands Agents: an open-source, model-driven framework for building and running AI agents
AWS Generative AI Innovation Center to help organizations move from prototypes to production-ready AI systems

Real-World Examples and Impact

Cohere Health:

Built an agentic system using Amazon Bedrock and AgentCore to automate medical coverage reviews
Achieved 30-40% faster reviews with fewer errors, providing faster answers to patients

Lyft:

Transformed customer support experience with AI-powered intent detection and resolution
Resolved 55% of customer interactions without human agents
Reduced average resolution time from 16 days to under 3 minutes

Key Takeaways

Trust is essential for scaling AI agents in production environments

AWS provides a comprehensive set of tools and services to build reliable, transparent, safe, and easy-to-use AI agents

Successful AI agent deployment requires aligning technology capabilities with business needs and user experience

Partnerships and collaboration are crucial for overcoming challenges and driving real-world impact
