AWS re:Invent 2025 - Advanced agentic RAG Systems: Deep dive with Amazon Bedrock (AIM425)

Advanced Agentic RAG Systems: Deep Dive with Amazon Bedrock

Overview

This presentation walked through the technical implementation of an "event agent": an assistant that retrieves session information from a knowledge base and stores and retrieves conversation history to personalize the user experience. The key components covered were:

  1. Knowledge Base Integration
  2. Agent Memory and Personalization
  3. Secure Production Deployment with AgentCore Runtime
  4. Identity Management with AgentCore Identity

Knowledge Base Integration

  • The presenters used Amazon Bedrock Knowledge Bases to ingest and index session data for the AWS re:Invent 2024 conference
  • This involved:
    • Downloading and splitting session data into 583 individual documents
    • Uploading documents to an S3 bucket
    • Creating an S3 vector store and index using 1024-dimensional Titan text embeddings
    • Configuring and creating the Bedrock Knowledge Base, specifying the embedding model and storage details
    • Ingesting the data into the knowledge base, which took ~2 minutes
  • The presenters created a custom "knowledge base search tool" that the agent can use to query the knowledge base and retrieve relevant session information using semantic search
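
A knowledge base search tool of this kind can be sketched as a thin wrapper around the Bedrock `retrieve` operation. This is a minimal sketch and not the presenters' code: the function name `search_sessions`, its parameters, and the shape of the returned dictionaries are assumptions made for illustration. The client is passed in so the function can be exercised without AWS credentials; in practice it would be something like `boto3.client("bedrock-agent-runtime")`.

```python
from typing import Any, Dict, List


def search_sessions(client: Any, knowledge_base_id: str, query: str,
                    max_results: int = 5) -> List[Dict[str, Any]]:
    """Run a semantic query against a knowledge base and flatten the hits.

    `client` is assumed to behave like boto3.client("bedrock-agent-runtime").
    The function name and return shape are hypothetical.
    """
    response = client.retrieve(
        knowledgeBaseId=knowledge_base_id,
        retrievalQuery={"text": query},
        retrievalConfiguration={
            "vectorSearchConfiguration": {"numberOfResults": max_results}
        },
    )
    hits = []
    for result in response.get("retrievalResults", []):
        hits.append({
            # Text of the matched chunk
            "text": result.get("content", {}).get("text", ""),
            # Relevance score reported by the service, if present
            "score": result.get("score"),
            # S3 URI of the source document, if present
            "source": result.get("location", {}).get("s3Location", {}).get("uri"),
        })
    return hits
```

An agent framework would typically register a function like this as a tool, so the model can call it with a query such as "What sessions are there on security and AI?" and receive the matching session text.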

Agent Memory and Personalization

  • To provide a personalized experience, the agent utilizes both short-term and long-term memory:
    • Short-term memory stores the raw conversation history between the user and agent
    • Long-term memory extracts and consolidates user preferences, interests, and other relevant insights from the conversation history
  • The presenters used AgentCore Memory, a managed service, to implement this memory functionality:
    • Short-term memory stores all conversation messages, identified by actor ID and session ID
    • Long-term memory uses configurable "strategies" (e.g. user preferences, semantic summaries) to extract and consolidate insights from the short-term history
    • Memory is organized into namespaces based on configurable metadata (e.g. actor ID, session ID, strategy ID) for better retrieval and access control
  • When the agent is initialized, it retrieves the user's long-term preferences from memory to personalize the experience
  • As new messages are added, they are stored in short-term memory, and the long-term strategies automatically extract insights from them
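
The relationship between short-term memory, strategies, and namespaces can be illustrated with a small in-memory model. This is a toy sketch, not the AgentCore Memory API: the class `ToyMemory`, the keyword-based `extract_interests` strategy, and the namespace template are all hypothetical, and the managed service performs extraction with models rather than string matching.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical namespace template; the placeholder names are illustrative only.
NAMESPACE_TEMPLATE = "/strategies/{strategy_id}/actors/{actor_id}"

Message = Tuple[str, str]  # (role, text)
Strategy = Callable[[List[Message]], List[str]]


def extract_interests(messages: List[Message]) -> List[str]:
    """Toy 'user preference' strategy: keep whatever follows 'interested in'."""
    marker = "interested in "
    found = []
    for role, text in messages:
        lowered = text.lower()
        if role == "user" and marker in lowered:
            found.append(lowered.split(marker, 1)[1].rstrip(".!? "))
    return found


class ToyMemory:
    """In-memory stand-in for a short-term plus long-term memory store."""

    def __init__(self, strategies: Dict[str, Strategy]):
        self.strategies = strategies
        self.short_term = defaultdict(list)  # (actor_id, session_id) -> messages
        self.long_term = defaultdict(list)   # namespace -> extracted insights

    def add_message(self, actor_id: str, session_id: str, role: str, text: str) -> None:
        # Short-term memory keeps the raw message, keyed by actor and session.
        self.short_term[(actor_id, session_id)].append((role, text))
        # Each strategy then extracts insights into its own namespace.
        for strategy_id, strategy in self.strategies.items():
            namespace = NAMESPACE_TEMPLATE.format(
                strategy_id=strategy_id, actor_id=actor_id)
            for insight in strategy([(role, text)]):
                if insight not in self.long_term[namespace]:
                    self.long_term[namespace].append(insight)

    def preferences(self, actor_id: str, strategy_id: str) -> List[str]:
        # Long-term insights span sessions because the namespace omits session_id.
        namespace = NAMESPACE_TEMPLATE.format(
            strategy_id=strategy_id, actor_id=actor_id)
        return list(self.long_term.get(namespace, []))
```

Because the namespace in this sketch is keyed by actor and strategy but not by session, preferences stated in one session are available when the agent is initialized for the next one, which is the behavior the personalization flow depends on.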

Secure Production Deployment

  • To deploy the agent in a production environment, the presenters used Amazon Bedrock AgentCore Runtime:
    • AgentCore Runtime provides a serverless, scalable runtime for running agent workloads with session isolation
    • It automatically provisions a separate micro-VM for each user session, ensuring complete isolation of compute, memory, and storage
    • The presenters demonstrated how to package the agent code into a container image and deploy it to AgentCore Runtime
  • To secure access to the agent, the presenters used Amazon Bedrock AgentCore Identity:
    • Users authenticate through an identity provider (e.g. Amazon Cognito)
    • The agent retrieves the user's identity token and uses AgentCore Identity to validate it before allowing access
    • This ensures only authorized users can interact with the agent, and their identity is propagated to the agent code
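
The access-control step can be illustrated by the claim checks that follow signature verification of an identity token. This is a hedged sketch, not the AgentCore Identity API: the names `authorize`, `handle_request`, and `AccessDenied` are hypothetical, and the code assumes the token's signature has already been verified against the identity provider's keys by a JWT library. It must not be used as a substitute for that verification.

```python
import time
from typing import Any, Dict, Optional


class AccessDenied(Exception):
    """Raised when the token claims do not authorize the request."""


def authorize(claims: Dict[str, Any], expected_issuer: str,
              expected_audience: str, now: Optional[float] = None) -> str:
    """Check already-verified token claims and return the user id (`sub`).

    ASSUMPTION: signature verification happened upstream; only claims are checked.
    """
    now = time.time() if now is None else now
    if claims.get("iss") != expected_issuer:
        raise AccessDenied("unexpected issuer")
    audience = claims.get("aud")
    audiences = audience if isinstance(audience, list) else [audience]
    if expected_audience not in audiences:
        raise AccessDenied("unexpected audience")
    expires_at = claims.get("exp")
    if not isinstance(expires_at, (int, float)) or expires_at <= now:
        raise AccessDenied("token expired or missing expiry")
    subject = claims.get("sub")
    if not subject:
        raise AccessDenied("missing subject")
    return subject


def handle_request(claims: Dict[str, Any], prompt: str, expected_issuer: str,
                   expected_audience: str, now: Optional[float] = None) -> Dict[str, str]:
    """Propagate the validated identity to the agent code as the actor id."""
    actor_id = authorize(claims, expected_issuer, expected_audience, now)
    return {"actor_id": actor_id, "prompt": prompt}
```

Using the validated subject as the actor ID ties the identity check to the memory layer: the agent can only read and write memory for the user the token was issued to.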

Key Takeaways

  • Agentic RAG (Retrieval Augmented Generation) systems can be built using a combination of knowledge bases, agent memory, and secure production infrastructure
  • Knowledge bases provide a way to ingest and index structured data (e.g. event session information) for semantic retrieval
  • Agent memory, with both short-term and long-term components, enables personalization by storing and extracting insights from user conversations
  • Secure production deployment with AgentCore Runtime and AgentCore Identity provides scalability, isolation, and access control for mission-critical agent applications

Technical Details

  • Amazon Bedrock Knowledge Bases
    • Used to ingest and index 583 documents on AWS re:Invent 2024 sessions
    • Used 1024-dimensional Titan text embeddings for semantic search
  • AgentCore Memory
    • Provided short-term storage of raw conversation messages
    • Extracted long-term insights using configurable "strategies" (e.g. user preferences, semantic summaries)
    • Organized memory into namespaces based on metadata like actor ID and session ID
  • AgentCore Runtime
    • Serverless, scalable runtime for running agent workloads
    • Provisioned separate micro-VMs for each user session to ensure isolation
  • AgentCore Identity
    • Validated user identity tokens from identity providers (e.g. Amazon Cognito)
    • Propagated user identity information to the agent code
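
To make the role of the embeddings concrete, semantic search over a vector index amounts to ranking stored vectors by similarity to the query vector. The sketch below uses cosine similarity on tiny hand-written vectors. It is an assumption for illustration only: real embeddings would be the 1024-dimensional vectors produced by the embedding model, the helper names are hypothetical, and the talk did not state which distance metric the index was configured with.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector: Sequence[float],
          index: Dict[str, Sequence[float]],
          k: int = 3) -> List[Tuple[str, float]]:
    """Return the k document ids most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query_vector, vector))
              for doc_id, vector in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A managed vector store performs this ranking with approximate nearest-neighbor search instead of a full scan, but the ordering principle is the same.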

Business Impact

  • The presented agentic RAG system enables a highly personalized event assistant experience, tailoring session recommendations and information retrieval to each individual user
  • By integrating knowledge bases, agent memory, and secure production infrastructure, the solution can scale to support large event applications
  • The ability to persistently store and retrieve user preferences and conversation history allows the agent to build a deeper understanding of each user over time, further enhancing the personalized experience

Examples

  • The event agent was able to retrieve relevant session information from the knowledge base based on the user's query (e.g. "What sessions are there on security and AI?")
  • The agent leveraged the user's long-term preferences, extracted from previous conversations, to provide personalized session recommendations (e.g. "Based on your interest in cloud architecture, AI, and security, here are some recommended sessions...")
  • The secure production deployment using AgentCore Runtime and AgentCore Identity ensured that each user's data and interactions were isolated and accessible only to authorized parties
