AWS re:Invent 2025 - Agents in the enterprise: Best practices with Amazon Bedrock AgentCore (AIM3310)

Scaling Agents in the Enterprise: Best Practices with Amazon Bedrock AgentCore

Introduction

Presenters: Costivasakis (Product Management Lead on AgentCore) and Lera Tankke (Tech Lead on the Agentic AI team)

Objective: Discuss best practices for taking agent-based applications from proof-of-concept to production at scale

The Challenge of Moving from Proof-of-Concept to Production

Customers describe a "PoC to production chasm": it is difficult to go from a demo to a production application that scales across users and provides the necessary governance

Key capabilities required:

Accuracy: Agents need to work well with real users, whose behavior may differ from developer expectations
Scalability: Agents must scale across users and domains while maintaining personalization
Secure Memory: Agents must securely handle memory across users and sessions
Cost Control: Hosting infrastructure and token usage for agents can be expensive, requiring cost observability
Observability: Detailed observability is needed to understand agent behavior and performance
Monitoring: Continuous monitoring is required to detect and address agent drift over time
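
The cost-control requirement above can be illustrated with a small per-user token ledger. This is a minimal sketch, not an AgentCore API: the class names, the user IDs, and the per-1K-token prices are all hypothetical and chosen only to make the arithmetic visible.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical per-1K-token prices in USD; real prices depend on model and region.
PRICE_PER_1K = {"input": 0.003, "output": 0.015}


@dataclass
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0

    def cost(self) -> float:
        """Convert accumulated token counts into an estimated spend."""
        return (self.input_tokens / 1000 * PRICE_PER_1K["input"]
                + self.output_tokens / 1000 * PRICE_PER_1K["output"])


class CostTracker:
    """Accumulates token usage per user so spend can be attributed."""

    def __init__(self) -> None:
        self._usage = defaultdict(Usage)

    def record(self, user_id: str, input_tokens: int, output_tokens: int) -> None:
        usage = self._usage[user_id]
        usage.input_tokens += input_tokens
        usage.output_tokens += output_tokens

    def cost_for(self, user_id: str) -> float:
        # .get avoids creating an entry for users that were never recorded
        return self._usage.get(user_id, Usage()).cost()

    def total_cost(self) -> float:
        return sum(usage.cost() for usage in self._usage.values())


tracker = CostTracker()
tracker.record("alice", input_tokens=2000, output_tokens=1000)
tracker.record("bob", input_tokens=1000, output_tokens=0)
```

Attributing spend per user (or per agent, per tool) is what turns a single monthly bill into something that can be budgeted and alerted on.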

Overview of Amazon Bedrock AgentCore

Runtime: Secure, serverless hosting engine for tools and agents, supporting real-time and long-running use cases

Memory: Provides short-term and long-term memory capabilities to maintain context across user sessions

Gateway: Exposes internal APIs and services to agents, with identity and access control

Identity: Integrates with workforce credentials (e.g. Okta, Cognito) to manage access to agents and tools

Policy: Allows defining rules to control access and actions for agents and tools

Tools: Provides pre-built components like a browser, code interpreter, and observability dashboards

Best Practices for Scaling Agents

Start Small, Think Big: Define a specific use case, create a proof-of-concept, and iterate quickly to validate what works

Implement Observability from the Start: Use OpenTelemetry-compatible traces to understand agent behavior, with dashboards for monitoring
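
To show what a trace of an agent invocation might contain, here is a stdlib-only stand-in for span recording. It deliberately does not use the OpenTelemetry SDK; the `span` helper, the span names, and the attribute keys are assumptions made for illustration, and a real deployment would emit spans through an OpenTelemetry-compatible library instead.

```python
import time
import uuid
from contextlib import contextmanager

SPANS = []    # finished spans, in completion order
_stack = []   # currently open spans, innermost last


@contextmanager
def span(name, **attributes):
    """Record one timed unit of work, linked to its parent span if any."""
    record = {
        "id": uuid.uuid4().hex,
        "name": name,
        "parent_id": _stack[-1]["id"] if _stack else None,
        "attributes": dict(attributes),
    }
    _stack.append(record)
    start = time.perf_counter()
    try:
        yield record
    except Exception as exc:
        # Keep the failure on the span so it is visible in the trace
        record["attributes"]["error"] = repr(exc)
        raise
    finally:
        record["duration_s"] = time.perf_counter() - start
        _stack.pop()
        SPANS.append(record)


# One agent invocation with a nested tool call and model call
with span("agent.invoke", user_id="u-1") as root:
    with span("tool.call", tool="lookup_ticket"):
        pass
    with span("model.call", model="hypothetical-model") as model_span:
        model_span["attributes"]["output_tokens"] = 42
```

The parent/child links are what let a dashboard show which tool or model call made a slow request slow.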

Expose Tools and APIs to Agents: Provide clear descriptions and parameters for tools, handle errors and retries, and reuse existing MCP servers
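
A tool definition with a clear description, typed parameters, and retry handling could look roughly like the sketch below. The `Tool` class, `TransientToolError`, and the `get_ticket_status` function are hypothetical; the parameter description follows a JSON-Schema-like shape, which is an assumption rather than something the talk specifies.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable


class TransientToolError(Exception):
    """Raised by a tool for failures worth retrying (e.g. a timeout)."""


@dataclass
class Tool:
    name: str
    description: str      # what the model reads to decide when to call the tool
    parameters: dict      # JSON-Schema-style description of the arguments
    func: Callable[..., Any]
    max_attempts: int = 3
    base_delay_s: float = 0.0   # zero here so the example runs instantly

    def call(self, **kwargs: Any) -> dict:
        # Validate required arguments before touching the backend
        missing = [p for p in self.parameters.get("required", []) if p not in kwargs]
        if missing:
            return {"ok": False, "error": f"missing parameters: {missing}"}
        for attempt in range(1, self.max_attempts + 1):
            try:
                return {"ok": True, "result": self.func(**kwargs), "attempts": attempt}
            except TransientToolError as exc:
                if attempt == self.max_attempts:
                    return {"ok": False, "error": str(exc), "attempts": attempt}
                time.sleep(self.base_delay_s * 2 ** (attempt - 1))  # exponential backoff
        return {"ok": False, "error": "no attempts made", "attempts": 0}


_calls = {"n": 0}


def get_ticket_status(ticket_id: str) -> str:
    """Hypothetical backend call that fails once, then succeeds."""
    _calls["n"] += 1
    if _calls["n"] < 2:
        raise TransientToolError("upstream timeout")
    return f"ticket {ticket_id}: open"


ticket_tool = Tool(
    name="get_ticket_status",
    description="Return the current status of a support ticket by its ID.",
    parameters={
        "type": "object",
        "properties": {"ticket_id": {"type": "string", "description": "Ticket identifier"}},
        "required": ["ticket_id"],
    },
    func=get_ticket_status,
)
result = ticket_tool.call(ticket_id="T-100")
```

Returning a structured error instead of raising gives the agent something it can reason about, for example asking the user for the missing parameter.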

Leverage Evaluations to Improve Agents: Define success metrics (both technical and business-oriented) and continuously evaluate agent performance
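
As a rough illustration of defining success metrics and scoring an agent against them, the sketch below runs a fixed set of cases and reports two technical metrics. `fake_agent`, the test cases, and the metric names are all hypothetical stand-ins; a real evaluation would call the deployed agent and would likely add business metrics as well.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    prompt: str
    expected_tool: str
    expected_substring: str


def fake_agent(prompt: str) -> dict:
    """Hypothetical stand-in for a real agent call."""
    if "balance" in prompt.lower():
        return {"tool": "get_balance", "answer": "Your balance is 120.00 USD"}
    return {"tool": "search_kb", "answer": "See the onboarding SOP."}


def evaluate(agent, cases) -> dict:
    """Score tool selection and answer content over a fixed case set."""
    if not cases:
        raise ValueError("at least one evaluation case is required")
    tool_hits = answer_hits = 0
    for case in cases:
        output = agent(case.prompt)
        tool_hits += output["tool"] == case.expected_tool
        answer_hits += case.expected_substring.lower() in output["answer"].lower()
    total = len(cases)
    return {
        "tool_selection_accuracy": tool_hits / total,
        "answer_match_rate": answer_hits / total,
    }


CASES = [
    EvalCase("What is my balance?", "get_balance", "120.00"),
    EvalCase("How do I onboard a client?", "search_kb", "SOP"),
    EvalCase("Show the balance trend", "get_balance", "chart"),
]
scores = evaluate(fake_agent, CASES)
```

Running the same case set after every prompt, model, or tool change is what makes the evaluation continuous rather than a one-off check.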

Adopt a Multi-Agent Architecture: Break down complex agents into specialized components to improve accuracy, speed, and cost-effectiveness
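
The idea of splitting one large agent into specialists behind an orchestrator can be sketched as a simple router. The specialist functions and the keyword table are hypothetical, and keyword matching is a deliberate simplification: a real orchestrator would more likely use a model or classifier to pick the specialist.

```python
def billing_agent(request: str) -> str:
    return f"[billing] handling: {request}"


def support_agent(request: str) -> str:
    return f"[support] handling: {request}"


def general_agent(request: str) -> str:
    return f"[general] handling: {request}"


# Hypothetical keyword-to-specialist table; first match wins
ROUTES = {
    "invoice": billing_agent,
    "refund": billing_agent,
    "error": support_agent,
    "login": support_agent,
}


def route(request: str) -> str:
    """Send the request to the first matching specialist, else the fallback."""
    lowered = request.lower()
    for keyword, agent in ROUTES.items():
        if keyword in lowered:
            return agent(request)
    return general_agent(request)
```

Each specialist can then carry a shorter prompt and a smaller tool set, which is where the accuracy, speed, and cost gains are expected to come from.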

Scale Agents Securely and with Personalization: Isolate user contexts and sessions, use per-user memory, and enforce access policies
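
One way to picture per-user memory with an access check is an in-memory store keyed by user and session. This is only a sketch under stated assumptions: `MemoryStore`, `AccessDenied`, and the caller-equals-owner rule are hypothetical and much simpler than a managed memory service with real identity and policy enforcement.

```python
from collections import defaultdict


class AccessDenied(Exception):
    """Raised when a caller asks for memory that belongs to another user."""


class MemoryStore:
    def __init__(self) -> None:
        self._short_term = defaultdict(list)   # (user_id, session_id) -> messages
        self._long_term = defaultdict(dict)    # user_id -> remembered facts

    @staticmethod
    def _check(caller_id: str, owner_id: str) -> None:
        # Hypothetical policy: a caller may only touch their own memory
        if caller_id != owner_id:
            raise AccessDenied(f"{caller_id} may not access memory of {owner_id}")

    def add_message(self, caller_id, user_id, session_id, message) -> None:
        self._check(caller_id, user_id)
        self._short_term[(user_id, session_id)].append(message)

    def session_history(self, caller_id, user_id, session_id) -> list:
        self._check(caller_id, user_id)
        return list(self._short_term.get((user_id, session_id), []))

    def remember(self, caller_id, user_id, key, value) -> None:
        self._check(caller_id, user_id)
        self._long_term[user_id][key] = value

    def recall(self, caller_id, user_id) -> dict:
        self._check(caller_id, user_id)
        return dict(self._long_term.get(user_id, {}))


store = MemoryStore()
store.add_message("alice", "alice", "s1", "Show my Q3 report")
store.remember("alice", "alice", "preferred_currency", "EUR")
store.add_message("bob", "bob", "s1", "Open a ticket")
```

Keying short-term memory by session and long-term memory by user keeps one user's context from leaking into another's while still allowing personalization across sessions.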

Leverage Code for Deterministic Tasks: Use code for calculations, validations, and other deterministic logic, reserving agents for reasoning and orchestration
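
For example, a calculation or validation that must always give the same answer can live in ordinary code that the agent calls as a tool. The function names and the ledger format below are hypothetical; the point is only that exact decimal arithmetic is done by code, not by the model.

```python
from decimal import Decimal, ROUND_HALF_UP


def percent_change(old: str, new: str) -> Decimal:
    """Exact percentage change, rounded to two decimal places."""
    old_value, new_value = Decimal(old), Decimal(new)
    if old_value == 0:
        raise ValueError("old value must be non-zero")
    change = (new_value - old_value) / old_value * 100
    return change.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def ledger_balances(entries) -> bool:
    """Validation rule: signed amounts in a ledger must sum to exactly zero."""
    total = sum((Decimal(entry["amount"]) for entry in entries), Decimal("0"))
    return total == 0
```

Called as `percent_change("200.00", "215.50")`, the function returns `Decimal("7.75")` every time, which is not something a language model can guarantee when asked to do the arithmetic itself.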

Test, Test, and Test Again: Implement continuous testing pipelines, use A/B testing, and monitor for performance drift in production
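
Two pieces of this practice lend themselves to a short sketch: assigning users to A/B variants in a stable way, and flagging drift when recent evaluation scores fall below a baseline. Both functions, the experiment name, and the 0.05 tolerance are hypothetical choices for illustration, and the mean comparison is a simple stand-in for a proper statistical test.

```python
import hashlib
from statistics import mean


def assign_variant(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically map a user to a variant so they always see the same one."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000   # uniform value in [0, 1)
    return "treatment" if bucket < treatment_share else "control"


def drifted(baseline_scores, recent_scores, tolerance: float = 0.05) -> bool:
    """Flag drift when the recent mean score drops more than `tolerance` below baseline."""
    return mean(baseline_scores) - mean(recent_scores) > tolerance
```

Hashing the experiment name together with the user ID keeps assignments independent across experiments without storing any state.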

Clearwater Analytics' Experience with AgentCore

Clearwater Analytics is a public fintech company providing financial accounting and reporting for institutional investors

They were early adopters of agent-based solutions, starting in 2023

Key use cases:

Internal knowledge base and SOP assistance
Salesforce ticket support
Accounting data analysis, anomaly detection, and visualization
Automated coding and code review
Financial data intake from PDFs

Challenges they faced:

Scalability, zero-downtime deployments, and avoiding "noisy neighbors"
Maintaining rapid follow-ups and context
Preserving existing custom features and integrations

Why they chose AgentCore:

Zero-downtime deployments and flexible technology stack
Isolated sessions and memory management
Ease of creating MCP servers for data access

Best Practices Learned:

Context is King: Ensure agents have unambiguous context to avoid hallucinations
Manage User Interactions: Use clarification in chat, and output confidence/rationale in automated workflows
Roll Out Strategically: Identify user pain points, build narrow use cases, and continuously monitor and iterate
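
The interaction guidance above (clarify in chat, emit confidence and rationale in automated workflows) might be wired up along these lines. The `AgentResult` fields, the 0.8 threshold, and the disposition labels are assumptions for illustration, not details Clearwater Analytics described.

```python
from dataclasses import dataclass


@dataclass
class AgentResult:
    value: str
    confidence: float   # 0.0 to 1.0, as reported by the agent
    rationale: str      # short explanation kept for auditability


REVIEW_THRESHOLD = 0.8  # hypothetical cut-off


def disposition(result: AgentResult, interactive: bool) -> str:
    """Decide what to do with a result based on confidence and channel."""
    if result.confidence >= REVIEW_THRESHOLD:
        return "accept"
    # In chat the agent can ask the user; in a batch workflow a human reviews
    return "ask_user_to_clarify" if interactive else "send_to_human_review"


extracted = AgentResult(
    value="Invoice total: 1,250.00 USD",
    confidence=0.62,
    rationale="Two totals appear on page 3; picked the one labeled 'Amount due'.",
)
```

With the sample result above, a chat session would prompt the user for clarification, while an automated intake pipeline would queue the document for human review along with the stored rationale.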

Key Takeaways

Agents require a robust infrastructure to scale effectively in the enterprise, addressing accuracy, scalability, security, cost, observability, and monitoring

Amazon Bedrock AgentCore provides a modular, managed platform to host and operate agent-based applications at scale

Best practices include starting small, implementing observability, exposing tools, using evaluations, adopting multi-agent architectures, scaling securely, leveraging code, and continuous testing

Clearwater Analytics' experience demonstrates the real-world application of these principles, highlighting the importance of context, user interaction management, and strategic rollout
