AWS re:Invent 2025 - Advanced RAG Architectures: From Basic Retrieval to Agentic RAG (NTA403)

Advanced RAG Architectures: From Basic Retrieval to Agentic RAG

Overview

  • Presenters: Vive Mittal and Palvi Nargun, AWS Solutions Architects
  • Focus: Improving the accuracy of Retrieval Augmented Generation (RAG) architectures
  • Audience: Experienced RAG users looking to enhance their systems

Key Challenges with RAG Architectures

  • As RAG applications become more complex, standard retrieval and generation techniques may not be sufficient
  • Issues can arise with:
    • Large document volumes
    • Interconnected/complex documents
    • Proprietary data and abbreviations
    • Varied user query patterns

Amazon Bedrock Knowledge Bases

  • Managed service for end-to-end RAG workflows
  • Provides:
    • Embedding models
    • Chunking strategies
    • Vector stores (OpenSearch, S3, etc.)
    • Language models for generation
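
As a rough illustration of the managed workflow, the sketch below builds a request for the `retrieve_and_generate` operation of the boto3 `bedrock-agent-runtime` client, which performs retrieval and generation in one call. The helper names `build_rag_request` and `ask_knowledge_base` and the placeholder identifiers are assumptions made for this example, not something shown in the talk.

```python
from typing import Any


def build_rag_request(question: str, kb_id: str, model_arn: str) -> dict[str, Any]:
    """Build keyword arguments for the RetrieveAndGenerate operation."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    }


def ask_knowledge_base(question: str, kb_id: str, model_arn: str) -> str:
    """Send the request. Needs boto3, AWS credentials, and an existing knowledge base."""
    import boto3  # imported lazily so the request builder works without AWS access

    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve_and_generate(
        **build_rag_request(question, kb_id, model_arn)
    )
    return response["output"]["text"]


# Placeholder values (hypothetical); substitute real identifiers before calling AWS.
request = build_rag_request(
    "What is the return policy?",
    kb_id="KB_ID_PLACEHOLDER",
    model_arn="MODEL_ARN_PLACEHOLDER",
)
```

Only `build_rag_request` runs without AWS access; `ask_knowledge_base` is included to show where the request would be sent.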

Advanced RAG Techniques

  1. Conditional Branching:

    • Intelligently select which vector store(s) to query based on the user's question
    • Example: Routing product-specific vs. policy-related queries to different data sources
  2. Parallel Branching:

    • Retrieve and combine data from multiple sources to provide a comprehensive response
    • Example: Identifying root cause, finding fix instructions, and checking inventory for a manufacturing issue
  3. Query Reformulation:

    • Break down complex, multi-part queries into smaller, more manageable sub-queries
    • Retrieve relevant chunks for each sub-query, rank them, and synthesize the final response
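
The first two techniques can be sketched in a few lines. This is a minimal illustration, assuming in-memory dictionaries in place of real vector stores and a keyword router in place of the LLM-based classifier a production system would more likely use; `STORES`, `ROUTES`, `route`, `search`, and `parallel_retrieve` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "vector stores": each maps a keyword to a stored passage.
STORES = {
    "products": {"battery": "Model X battery lasts 10 hours."},
    "policies": {"refund": "Refunds are accepted within 30 days."},
    "inventory": {"battery": "Battery packs in stock: 12."},
}

# Stand-in for an LLM-based classifier: route on keywords (assumption).
ROUTES = {
    "products": ("battery", "model", "spec"),
    "policies": ("refund", "return", "warranty"),
}


def route(question: str) -> list[str]:
    """Conditional branching: pick the store(s) whose keywords appear in the question."""
    q = question.lower()
    chosen = [name for name, words in ROUTES.items() if any(w in q for w in words)]
    return chosen or list(ROUTES)  # no match: fall back to every routed store


def search(store: str, question: str) -> list[str]:
    """Toy lookup: return passages whose key occurs in the question."""
    q = question.lower()
    return [text for key, text in STORES[store].items() if key in q]


def parallel_retrieve(question: str, stores: list[str]) -> list[str]:
    """Parallel branching: query several stores concurrently and merge the hits."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda s: search(s, question), stores))
    return [passage for hits in results for passage in hits]


single_store = route("What is the refund window?")
combined = parallel_retrieve("battery problem", ["products", "inventory"])
```

`pool.map` returns results in the order of the input stores, so the merged context is deterministic even though the lookups run concurrently.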

Self-Corrective Agentic RAG

  • Central agent orchestrates the RAG workflow
  • Iteratively checks:
    1. Relevance of retrieved chunks to the original query
    2. Quality and completeness of the generated response
  • Selects and applies the appropriate technique(s):
    • Basic RAG
    • Query expansion
    • Query decomposition
    • Retrieve document
    • Evaluate response quality
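
The relevance check and technique selection can be sketched as a small loop. This is a toy under stated assumptions: a token-overlap retriever stands in for a vector store, a glossary-aware overlap score stands in for an LLM grader, and only two techniques (basic RAG and query expansion) are tried. All names here (`CORPUS`, `GLOSSARY`, `retrieve`, `relevance`, `expand`, `self_corrective_retrieve`) are hypothetical, and the second check, response quality, is not covered.

```python
import re

# Hypothetical corpus standing in for a vector store.
CORPUS = [
    "Unused paid time off carries over up to five days.",
    "Expense reports are due by the fifth business day.",
]

# Hypothetical glossary of internal abbreviations, used for query expansion.
GLOSSARY = {"pto": "paid time off"}


def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, k: int = 2) -> list[str]:
    """Toy retriever: rank passages by token overlap and drop zero-overlap ones."""
    q = tokens(query)
    scored = [(len(q & tokens(p)), p) for p in CORPUS]
    ranked = sorted(scored, key=lambda sp: sp[0], reverse=True)
    return [p for score, p in ranked if score > 0][:k]


def expand(query: str) -> str:
    """Query expansion: append glossary definitions for known abbreviations."""
    extra = [GLOSSARY[t] for t in sorted(tokens(query)) if t in GLOSSARY]
    return " ".join([query, *extra])


def relevance(original_query: str, passage: str) -> float:
    """Stand-in for an LLM grader: share of (glossary-aware) query terms in the passage."""
    q = tokens(expand(original_query))
    return len(q & tokens(passage)) / len(q) if q else 0.0


def self_corrective_retrieve(query: str, threshold: float = 0.5):
    """Try each technique in turn until chunks are relevant to the original query."""
    strategies = [("basic", lambda q: q), ("expansion", expand)]
    for name, rewrite in strategies:
        passages = retrieve(rewrite(query))
        score = max((relevance(query, p) for p in passages), default=0.0)
        if score >= threshold:
            return name, passages
    return "none", []


technique, passages = self_corrective_retrieve("PTO carryover rules")
```

Relevance is always graded against the original query rather than the rewritten one, so a rewrite cannot make an off-topic chunk look relevant.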

Additional RAG Optimization Techniques

  1. Ingestion Flow Enhancements:

    • Chunking strategies (fixed, semantic, hierarchical)
    • Foundation model parsing for multi-modal content
    • Metadata labeling for targeted retrieval
  2. Retrieval Flow Enhancements:

    • Metadata filtering
    • Chunk reranking
    • Hybrid search (semantic + keyword)
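
Three of these enhancements are easy to sketch: fixed-size chunking with overlap, metadata filtering, and hybrid search. The talk summary does not say how semantic and keyword results are merged; reciprocal rank fusion is used here as one common choice, which is an assumption, as are the function names and the sample chunk IDs. Reranking is left out.

```python
def chunk_fixed(text: str, size: int, overlap: int) -> list[str]:
    """Fixed-size chunking over words; neighbors share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def filter_by_metadata(chunks: list[dict], **required) -> list[dict]:
    """Metadata filtering: keep chunks whose metadata matches every required pair."""
    return [
        c for c in chunks
        if all(c["meta"].get(key) == value for key, value in required.items())
    ]


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Hybrid search: merge ranked ID lists, scoring each ID by the sum of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


pieces = chunk_fixed("a b c d e f g h i j", size=4, overlap=1)

# Hypothetical chunk records and rankings from a semantic and a keyword search.
chunks = [
    {"id": "c1", "meta": {"doc_type": "manual"}},
    {"id": "c2", "meta": {"doc_type": "policy"}},
    {"id": "c3", "meta": {"doc_type": "manual"}},
    {"id": "c4", "meta": {"doc_type": "manual"}},
]
allowed = {c["id"] for c in filter_by_metadata(chunks, doc_type="manual")}
semantic_ranking = ["c1", "c2", "c3", "c4"]
keyword_ranking = ["c3", "c4", "c1"]
fused = reciprocal_rank_fusion([
    [d for d in semantic_ranking if d in allowed],
    [d for d in keyword_ranking if d in allowed],
])
```

Filtering before fusion keeps out-of-scope chunks from taking rank positions away from the chunks the metadata filter allows.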

Business Impact and Use Cases

  • Improved accuracy and relevance of RAG-powered applications
  • Enables more complex, enterprise-grade use cases:
    • Intelligent customer support
    • Autonomous decision-making systems
    • Knowledge-intensive business processes

Key Takeaways

  • RAG architectures require a multi-faceted approach to achieve high accuracy
  • Techniques like conditional/parallel branching, query reformulation, and self-corrective agentic RAG can improve retrieval relevance and answer accuracy
  • Optimizing both the ingestion and retrieval flows is crucial for overall RAG system improvement
  • Advanced RAG techniques enable more sophisticated, enterprise-ready applications that can handle complex queries and data
