Building Serverless Chatbots with Amazon ElastiCache and Aurora PostgreSQL
Overview
The presentation discusses how to build scalable, high-performance serverless chatbots using Amazon ElastiCache and Aurora PostgreSQL.
The case study focuses on Flightly, a fictional travel platform that allows customers to search flights, book hotels, and plan vacations.
Flightly's Initial Architecture and Challenges
Flightly initially built an MVP with a simple chatbot architecture, but as user adoption grew, they faced performance issues.
The system was hitting database bottlenecks, with 30-second average response times for a single booking, leading to a 50% user abandonment rate.
This translated into an estimated $50 million in annual lost revenue.
The Need for Caching and Semantic Search
To address the performance issues, Flightly decided to keep Aurora PostgreSQL as the source of truth and add a caching layer.
Caching frequently asked questions, baggage policies, and booking templates in Amazon ElastiCache can provide sub-millisecond response times.
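A minimal sketch of the cache-aside pattern described above, using a plain in-memory dict as a stand-in for ElastiCache so the example is self-contained; in production the reads and writes would go through a Redis client (e.g. `SET key value EX ttl`). The names `fetch_from_aurora`, `get_answer`, and `CACHE_TTL_SECONDS` are illustrative, not from the presentation.

```python
import time

# Dict standing in for Amazon ElastiCache; each entry stores
# (answer, expiry timestamp) to mimic a Redis SET with a TTL.
CACHE_TTL_SECONDS = 3600
cache: dict[str, tuple[str, float]] = {}

def fetch_from_aurora(question: str) -> str:
    """Stand-in for the authoritative (slow) Aurora PostgreSQL lookup."""
    return f"answer-for:{question}"

def get_answer(question: str) -> tuple[str, bool]:
    """Cache-aside read: return (answer, was_cache_hit)."""
    entry = cache.get(question)
    if entry is not None and entry[1] > time.monotonic():
        return entry[0], True              # fast path: served from cache
    answer = fetch_from_aurora(question)   # slow path: query the database
    cache[question] = (answer, time.monotonic() + CACHE_TTL_SECONDS)
    return answer, False

first = get_answer("What is the carry-on baggage limit?")
second = get_answer("What is the carry-on baggage limit?")
```

The first call misses and falls through to the database; the repeat call is served from the cache, which is what turns 30-second lookups into sub-millisecond ones for hot questions.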
Flightly also implemented semantic search using Aurora PostgreSQL's pgvector extension, which stores vector embeddings and supports similarity searches over them.
This enables the chatbot to understand user intent beyond just keyword matching, providing more relevant and personalized responses.
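To make the similarity-search idea concrete, here is a sketch using toy 3-dimensional embeddings and a pure-Python cosine similarity; real embeddings would come from an embedding model and be stored in a pgvector column. The table and column names in the SQL string are illustrative assumptions; `<=>` is pgvector's cosine-distance operator.

```python
import math

# Toy embeddings standing in for model-generated vectors.
DOCS = {
    "Carry-on bags must fit under the seat.": [0.9, 0.1, 0.0],
    "Hotels can be booked up to a year ahead.": [0.1, 0.9, 0.1],
    "Vacation packages bundle flights and hotels.": [0.2, 0.6, 0.8],
}

# Roughly equivalent pgvector query (illustrative schema):
PGVECTOR_QUERY = """
SELECT content FROM faq
ORDER BY embedding <=> %(query_vec)s
LIMIT 1;
"""

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms

def nearest(query_vec: list[float]) -> str:
    """Return the document whose embedding is most similar to the query."""
    return max(DOCS, key=lambda d: cosine_similarity(DOCS[d], query_vec))

# A "baggage-like" query vector matches the baggage document even though
# it shares no keywords with it -- intent, not keyword matching.
match = nearest([0.85, 0.15, 0.05])
```

This is the property the presentation highlights: a question phrased as "how big can my hand luggage be" lands near the baggage policy in embedding space, with no keyword overlap required.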
Architectural Patterns
Context Cache: Storing user context (chat history, session state, preferences) in ElastiCache for quick retrieval.
Embedding Cache: Caching vector embeddings in ElastiCache to skip the expensive embedding generation process for repeat queries.
Durable Semantic Caching: Caching semantic search results in ElastiCache to avoid recomputing vector distances for similar queries.
Tiered Memory Management: Using ElastiCache for short-term memory (chat messages, session state) and Aurora PostgreSQL for long-term memory (episodic recall, user preferences).
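One way the semantic-caching pattern above might look in code, with a plain list standing in for ElastiCache so the sketch is runnable; the similarity threshold, the `answer_query` helper, and the `run_full_search` callback are all assumptions for illustration, not APIs from the presentation.

```python
import math

SIMILARITY_THRESHOLD = 0.95  # illustrative; tune per workload

# Each entry: (query_embedding, cached_answer). A real deployment would
# keep these in ElastiCache rather than in process memory.
semantic_cache: list[tuple[list[float], str]] = []

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def answer_query(query_vec, run_full_search):
    """Return (answer, was_cache_hit); near-duplicate queries hit the cache."""
    for cached_vec, cached_answer in semantic_cache:
        if cosine_similarity(cached_vec, query_vec) >= SIMILARITY_THRESHOLD:
            return cached_answer, True       # skip the expensive vector search
    answer = run_full_search(query_vec)      # full similarity search in Aurora
    semantic_cache.append((query_vec, answer))
    return answer, False

full_search = lambda vec: "Checked bags up to 23 kg fly free."
a1, hit1 = answer_query([1.0, 0.0, 0.0], full_search)
a2, hit2 = answer_query([0.99, 0.05, 0.0], full_search)  # near-paraphrase
```

The second query is not byte-identical to the first, so a plain key-value cache would miss it; because its embedding sits within the similarity threshold, the semantic cache returns the stored answer without recomputing any vector distances in the database.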
Scaling to Production with Amazon Bedrock AgentCore
To scale the chatbot architecture to a million daily queries, Flightly leveraged Amazon Bedrock AgentCore, a fully managed platform for building and deploying AI agents.
Bedrock AgentCore provides a runtime, identity management, a gateway, and an observability layer for running agents securely at scale.
This allows Flightly to run their agents in a scalable, highly available, and secure cloud environment, without having to manage the underlying infrastructure.
Business Impact
By implementing the caching and semantic search patterns, Flightly was able to achieve sub-millisecond response times for frequently asked questions and sub-100ms response times for more complex queries.
This resulted in a 60% reduction in infrastructure costs and a 40% increase in customer retention, as users no longer abandoned the platform due to slow response times.
The agentic AI architecture enabled Flightly to expand beyond simple Q&A and build a more sophisticated chatbot that can handle booking flows, payment processing, and other complex tasks.
Key Takeaways
Caching is essential for building high-performance conversational AI systems, especially at scale.
Semantic search and vector embeddings can significantly improve the understanding of user intent beyond simple keyword matching.
A tiered memory architecture, with short-term memory in ElastiCache and long-term memory in Aurora PostgreSQL, can provide seamless context preservation and multi-agent workflows.
Leveraging a managed platform like Amazon Bedrock AgentCore can simplify the deployment and scaling of production-ready conversational AI systems.