Unified Knowledge Access: Bridging Data with Generative AI Agents
Overview
This presentation from AWS re:Invent 2025 showcases a new approach to unifying access to structured and unstructured data using AI agents. The key focus is on building a charity chatbot application that can seamlessly query both relational databases and unstructured knowledge bases to provide comprehensive answers to user questions.
Architecture and Components
The application architecture consists of the following key components:
- UI Application: A Streamlit-based UI that allows users to ask questions, which are then processed by the AI agent.
- AI Agent: The core of the system, built using the Strands SDK, which orchestrates the interaction between the user, the structured data source, and the unstructured knowledge base.
- Structured Data Source: An Amazon Aurora PostgreSQL database containing the charity's membership, donation, and campaign data.
- Unstructured Knowledge Base: An Amazon Bedrock knowledge base storing related campaign documents and supporting information.
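The way these components fit together can be sketched as a minimal tool-routing loop. This is a simplified stand-in, not the actual Strands SDK API; the tool names and the keyword-based router are illustrative assumptions (in the real application, the LLM makes the routing decision):

```python
# Minimal sketch of the agent's tool-routing loop (illustrative only,
# not the Strands SDK API; tool names and router are assumptions).
class CharityAgent:
    def __init__(self, tools, choose_tool):
        self.tools = tools              # tool name -> callable
        self.choose_tool = choose_tool  # question -> tool name (the LLM, in the real app)

    def ask(self, question):
        name = self.choose_tool(question)
        return self.tools[name](question)

# Toy router: aggregate-style questions go to SQL, everything else to the knowledge base.
def keyword_router(question):
    aggregates = ("how many", "total", "sum")
    return "sql_query" if any(w in question.lower() for w in aggregates) else "retrieve"

agent = CharityAgent(
    tools={
        "retrieve": lambda q: "knowledge-base passages for: " + q,
        "sql_query": lambda q: "rows from Aurora for: " + q,
    },
    choose_tool=keyword_router,
)
```

In the session's application, both branches feed back into the LLM so it can compose a single natural-language answer.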
Bridging Structured and Unstructured Data
The presenters demonstrate how the AI agent leverages two key capabilities to bridge the structured and unstructured data sources:
- Retrieve Tool: A pre-built Strands SDK tool that allows the agent to query the Bedrock knowledge base and retrieve relevant information from the unstructured data.
- Custom SQL Query Tool: A custom tool built by the presenters that uses an LLM (in this case, an NVIDIA model) to dynamically generate SQL queries from the user's question and the database schema, allowing the agent to retrieve accurate data from the relational database.
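A custom SQL tool along these lines can be sketched with an in-memory SQLite database standing in for Aurora PostgreSQL, and a stub in place of the LLM call; the table, columns, and `generate_sql` stub are all assumptions for illustration:

```python
import sqlite3

def generate_sql(question, schema):
    # Stand-in for the LLM call: the real tool would prompt the model with
    # the question plus the schema and parse the SQL out of its response.
    return "SELECT COUNT(*) FROM donations WHERE amount > 100"

def sql_query_tool(conn, question):
    # Gather the schema so the model knows which tables and columns exist.
    schema = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    query = generate_sql(question, schema)
    return conn.execute(query).fetchall()

# Toy data standing in for the charity's donation table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE donations (donor TEXT, amount REAL)")
conn.executemany("INSERT INTO donations VALUES (?, ?)",
                 [("Ana", 250.0), ("Ben", 50.0), ("Cy", 120.0)])

rows = sql_query_tool(conn, "How many donations were over $100?")
```

A production version would also validate the generated SQL (e.g., allow only read-only statements) before executing it against the live database.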
Intelligent Query Processing
The key innovation is the way the AI agent intelligently determines which data source to query based on the user's question. The process involves:
- Schema Introspection: The agent first retrieves the database schema using a SQL query against the information schema, allowing it to understand the structure of the underlying data.
- Prompt Generation: The agent then builds a system prompt combining the user's question with the database schema; the LLM uses this prompt to decide whether the structured data can answer the question and, if so, to generate the necessary SQL query.
- SQL Execution: If the LLM determines that the question can be answered from the structured data, the agent executes the generated SQL query against the Aurora database and returns the results.
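The first two steps can be sketched as follows. The `information_schema` query is standard PostgreSQL; the prompt wording and sample rows are assumptions:

```python
# Standard PostgreSQL query to introspect the columns of the public tables.
SCHEMA_QUERY = """
SELECT table_name, column_name, data_type
FROM information_schema.columns
WHERE table_schema = 'public'
ORDER BY table_name, ordinal_position;
"""

def build_system_prompt(schema_rows, question):
    # schema_rows: (table_name, column_name, data_type) tuples from SCHEMA_QUERY.
    schema_text = "\n".join(f"{t}.{c}: {d}" for t, c, d in schema_rows)
    return (
        "You translate questions into PostgreSQL.\n"
        f"Schema:\n{schema_text}\n"
        "If the question cannot be answered from this schema, reply NO_SQL.\n"
        f"Question: {question}"
    )

# Sample rows as SCHEMA_QUERY would return them for a donations table.
rows = [("donations", "donor", "text"), ("donations", "amount", "numeric")]
prompt = build_system_prompt(rows, "What was the total donated in 2024?")
```

The NO_SQL escape hatch is one way to let the model signal that the question should be routed to the knowledge base instead.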
Business Impact and Use Cases
The presenters highlight the following key benefits and use cases of this approach:
- Unified Data Access: By bridging structured and unstructured data sources, the AI agent can provide more comprehensive and accurate answers to user questions, drawing from the enterprise's full knowledge base.
- Scalable and Adaptable: The agent-based architecture allows the system to be easily scaled and adapted to new data sources and use cases, making it a powerful tool for enterprises with complex information landscapes.
- Improved User Experience: The chatbot interface and intelligent query processing provide a seamless and intuitive user experience, making it easier for end-users to access the organization's knowledge and data.
Deployment and Implementation
While the presenters did not deploy the application to a production environment, they discussed several options for running the AI agent-based application, including:
- AWS Lambda
- Container services like ECS or EKS
- The Amazon Bedrock AgentCore runtime
They also emphasized the importance of properly managing sensitive information, such as database credentials, using services like AWS Secrets Manager.
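For example, database credentials can be fetched at startup from Secrets Manager and turned into a connection string. The secret name and JSON field names below are assumptions; a sample payload is used so the sketch runs without AWS credentials:

```python
import json

# In the deployed application, the payload would come from Secrets Manager:
#   secret = boto3.client("secretsmanager").get_secret_value(SecretId="charity/aurora")
#   payload = secret["SecretString"]
# Sample payload (field names are assumptions) so this sketch runs locally.
payload = ('{"username": "app", "password": "s3cret", '
           '"host": "db.example.internal", "port": 5432, "dbname": "charity"}')

creds = json.loads(payload)
dsn = (
    f"postgresql://{creds['username']}:{creds['password']}"
    f"@{creds['host']}:{creds['port']}/{creds['dbname']}"
)
```

Keeping the secret out of environment variables and source code, and rotating it via Secrets Manager, follows the practice the presenters described.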
Conclusion
The presented approach demonstrates the power of combining structured and unstructured data access using generative AI agents. By leveraging the Strands SDK and LLMs, enterprises can build intelligent applications that give users a unified view of their knowledge and data, leading to improved decision-making, productivity, and customer experiences.