AWS re:Invent 2025 - Using graphs over your data lake to power generative AI applications (DAT447)
Summary of "Using graphs over your data lake to power generative AI applications" (AWS re:Invent 2025)
Leveraging Graphs and Data Lakes
Graphs allow businesses to leverage the relationships in their data to solve problems and ask valuable questions.
Successful companies find ways to use graphs to provide unique and interesting functionality for their customers.
There is a growing demand from customers to improve the accuracy of generative AI applications by leveraging data from their data lakes.
Strategies for Combining Graphs and Data Lakes
1. Reading and Federating Data from the Data Lake
Neptune Analytics supports a unified graph data model that can ingest both RDF and property graph data.
The Neptune Read algorithm allows tabular data (CSV, Parquet) stored in S3 to be queried directly from openCypher.
This enables federating external data with the existing graph data.
Example: Combining airport metadata from the graph with recent flight data from the data lake.
The data from the data lake can also be used to augment the existing graph by creating new edges.
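The federation idea above can be sketched in plain Python. This is a minimal illustration only, not the Neptune Analytics API: the airport dictionary, the route set, the inline CSV text, and the function names `read_rows`, `federate`, and `augment` are all hypothetical stand-ins for graph data and a file in the data lake.

```python
import csv
import io

# Hypothetical graph data: airport nodes keyed by code, plus existing route edges.
airports = {
    "SEA": {"city": "Seattle"},
    "LAS": {"city": "Las Vegas"},
    "JFK": {"city": "New York"},
}
routes = {("SEA", "LAS")}

# Stand-in for a CSV file of recent flights stored in the data lake.
FLIGHTS_CSV = """origin,dest,delay_minutes
SEA,LAS,12
SEA,JFK,45
LAS,JFK,5
SEA,LAS,20
"""


def read_rows(text):
    """Parse tabular text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def federate(rows, airport_nodes):
    """Join each flight row with airport metadata held in the graph."""
    joined = []
    for row in rows:
        origin, dest = row["origin"], row["dest"]
        if origin in airport_nodes and dest in airport_nodes:
            joined.append({
                "from_city": airport_nodes[origin]["city"],
                "to_city": airport_nodes[dest]["city"],
                "delay_minutes": int(row["delay_minutes"]),
            })
    return joined


def augment(rows, existing_routes):
    """Return the route set extended with edges observed in the rows."""
    return existing_routes | {(r["origin"], r["dest"]) for r in rows}


rows = read_rows(FLIGHTS_CSV)
joined = federate(rows, airports)
augmented = augment(rows, routes)
```

Here `joined` holds flight rows enriched with city names from the graph, and `augmented` is the original route set plus the routes that only appeared in the tabular data.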
2. Leveraging Graph Algorithms
Graph algorithms can derive valuable insights from graph data, such as identifying important entities (e.g. influential researchers, critical airports, fraudulent transactions) with PageRank and other centrality measures.
The presenters provide examples of implementing PageRank and Closeness Centrality algorithms using openCypher.
Neptune Analytics provides a suite of efficient graph algorithms that can be easily applied to the graph data.
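To make the two algorithms named above concrete, here is a small pure-Python sketch. It is not the presenters' code and not the Neptune Analytics implementation; the toy graph and the function names are assumptions, and the closeness variant used here (reachable nodes divided by the sum of their shortest-path distances along outgoing edges) is one of several common definitions.

```python
from collections import deque

# Hypothetical directed graph: node -> list of successors.
GRAPH = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank; every successor must also be a key of graph."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            successors = graph[v]
            # A node with no outgoing edges spreads its rank over all nodes.
            targets = successors if successors else nodes
            share = damping * rank[v] / len(targets)
            for w in targets:
                new_rank[w] += share
        rank = new_rank
    return rank


def closeness(graph, source):
    """Reachable node count divided by the sum of BFS distances from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


ranks = pagerank(GRAPH)
most_important = max(ranks, key=ranks.get)
```

In this toy graph, node `C` receives links from three other nodes and ends up with the highest rank, which is the kind of signal used to flag a critical airport or an influential researcher.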
Combining Vector Search and Graph Traversals
Vector similarity search is powerful but can be a "black box" in terms of explaining the results.
Combining vector search with explicit graph traversals can provide more transparency and better explanations.
Example: Using a hybrid search approach that leverages a text index, vector similarity, and a knowledge graph.
Graphs can be augmented with vector embeddings, allowing for combined vector and graph-based search.
Example: Using a graph-based approach to identify fraudulent book listings generated by AI, combining graph traversal and vector similarity.
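The combination of vector similarity and explicit graph relationships can be sketched as follows. This is an illustrative toy, assuming made-up three-dimensional embeddings, a hypothetical `MENTIONS` edge set linking documents to entities, and an arbitrary `graph_boost` weight; the text index from the hybrid example is left out for brevity.

```python
import math

# Hypothetical toy embeddings for three documents.
EMBEDDINGS = {
    "doc_graphs": [0.9, 0.1, 0.0],
    "doc_vectors": [0.7, 0.3, 0.1],
    "doc_cooking": [0.0, 0.2, 0.9],
}

# Hypothetical knowledge-graph edges: document -> entities it mentions.
MENTIONS = {
    "doc_graphs": {"Neptune", "PageRank"},
    "doc_vectors": {"Embeddings"},
    "doc_cooking": {"Pasta"},
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query_vec, query_entities, k=2, graph_boost=0.5):
    """Rank documents by vector similarity plus a bonus per shared graph entity."""
    results = []
    for doc, vec in EMBEDDINGS.items():
        similarity = cosine(query_vec, vec)
        # One-hop traversal: entities linked to both the query and the document.
        shared = sorted(MENTIONS[doc] & query_entities)
        results.append({
            "doc": doc,
            "score": similarity + graph_boost * len(shared),
            "similarity": similarity,
            "explained_by": shared,
        })
    results.sort(key=lambda r: r["score"], reverse=True)
    return results[:k]


vector_only = hybrid_search([0.8, 0.2, 0.0], set())
hits = hybrid_search([0.8, 0.2, 0.0], {"Embeddings"})
```

With an empty entity set the ranking is driven by similarity alone; once the query is linked to the `Embeddings` entity, `doc_vectors` moves to the top, and the `explained_by` field records the graph relationship that justified the change.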
Enhancing Generative AI Applications with Graph-Powered Memory
Agentic applications (e.g. chatbots) often suffer from a "goldfish effect": without persistent memory, they forget earlier context.
Incorporating different types of memory (working, short-term, long-term) into agentic applications can improve accuracy and reduce latency.
Example: Using a knowledge graph-based memory system to store and retrieve context for a travel planning agent.
Tools like the Strands Agents SDK, MCP servers, and Zep's Graphiti provide ways to add graph-powered persistent memory to agentic applications.
Case study: Trend Micro saw a 20% increase in chatbot accuracy and improved user satisfaction by leveraging graph-powered memory.
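A rough sketch of the memory tiers described above, applied to the travel planning example. This does not use the Strands, MCP, or Graphiti APIs; the `AgentMemory` class, its method names, and the sample facts are hypothetical, with a bounded queue standing in for short-term memory and a set of (subject, relation, object) triples standing in for a knowledge graph.

```python
from collections import deque


class AgentMemory:
    """Toy memory: a bounded window of recent turns plus long-term graph facts."""

    def __init__(self, short_term_size=3):
        # Short-term memory: only the most recent conversation turns are kept.
        self.short_term = deque(maxlen=short_term_size)
        # Long-term memory: (subject, relation, object) triples, i.e. graph edges.
        self.long_term = set()

    def observe(self, turn):
        """Record a conversation turn; the oldest turn is dropped when full."""
        self.short_term.append(turn)

    def remember(self, subject, relation, obj):
        """Persist a fact as an edge in the long-term graph."""
        self.long_term.add((subject, relation, obj))

    def recall(self, entity):
        """Return all stored facts that touch the given entity, sorted."""
        return sorted(t for t in self.long_term if entity in (t[0], t[2]))

    def build_context(self, entity):
        """Assemble working memory: recent turns plus facts about the entity."""
        facts = [f"{s} {r} {o}" for s, r, o in self.recall(entity)]
        return {"recent_turns": list(self.short_term), "facts": facts}


memory = AgentMemory(short_term_size=2)
memory.remember("user", "prefers", "window seats")
memory.remember("user", "home_airport", "SEA")
memory.remember("SEA", "located_in", "Seattle")
for turn in ["hi", "plan a trip", "to Las Vegas"]:
    memory.observe(turn)

context = memory.build_context("user")
```

The first turn has already fallen out of the short-term window, but the user's stored preferences are still retrieved from the long-term graph, which is the behavior that counters the "goldfish effect".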
Key Takeaways
Graphs are a powerful tool for extracting value from data, especially when combined with data lakes.
Strategies like reading/federating data, applying graph algorithms, and combining vector search with graph traversals can unlock new capabilities.
Incorporating graph-powered persistent memory can significantly improve the performance and user experience of agentic applications.
The presenters provide code examples and resources for further exploration of these techniques.