AWS re:Invent 2025 - AI Agents at the Edge: Build for Offline, Scale in Cloud (DEV301)
Introduction
Presentation by Anna, an AWS Developer Advocate, and David, an AWS Community Hero
Topic: Leveraging local Large Language Models (LLMs) and agentic workflows to build AI solutions that can operate offline and scale in the cloud
The Challenge of Downtime
Downtime in industrial operations costs an average of over $260,000 per hour
Many industries are affected by this problem, including mining, manufacturing, oil and gas, energy, government, and retail
The Cloud Dependency Problem
AI agents typically require a cloud connection to function
If the internet connection is lost, the agents become unusable
Requirements for Offline AI Agents
Agents need to be powered by LLMs that can reason and think locally, without relying on the cloud
Agents need access to local tools and actions to take real-world steps, not just language processing
Agents need to maintain context and memory between sessions, even when disconnected from the cloud
Agents need to be able to sync data and updates back to the cloud when connectivity is restored
The Hybrid Architecture Solution
Using the open-source Strands Agents SDK and Ollama, a local LLM runtime
Strands Agents handle the reasoning and decision-making using local LLMs
Strands Agents can call local tools and actions to interact with the physical world
Session management and memory are handled locally, even during connectivity loss
When online, the agents can sync data and updates back to the cloud via Amazon Bedrock
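The store-and-forward idea behind the hybrid architecture can be sketched without any SDK: buffer agent events locally while offline, then flush them when connectivity returns. This is a minimal stdlib illustration, not the Strands/Bedrock implementation; the `upload` callable is a hypothetical stand-in for a real cloud API call.

```python
from collections import deque

class SyncQueue:
    """Buffers agent events locally while offline and flushes them
    to the cloud when connectivity returns. `upload` is a stand-in
    for a real cloud API call (e.g., to Amazon Bedrock)."""

    def __init__(self, upload):
        self.upload = upload      # function(event_dict) -> bool (True = delivered)
        self.pending = deque()    # events waiting for connectivity

    def record(self, event):
        """Always enqueue locally first, so nothing is lost offline."""
        self.pending.append(event)

    def flush(self):
        """Push queued events in order; stop at the first failure
        (treated as 'still offline') and keep the rest queued."""
        sent = 0
        while self.pending:
            if not self.upload(self.pending[0]):
                break
            self.pending.popleft()
            sent += 1
        return sent
```

The key design choice is that `record` never touches the network, so the agent's local loop keeps working during an outage and `flush` can be retried on a timer.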
Strands Agents
Open-source Python and TypeScript SDK for building agentic workflows
Allows easy integration of LLMs, custom tools, memory management, and more
Supports models from multiple providers, including Amazon Bedrock, Anthropic, OpenAI, Ollama, and others
Creating a Local Offline Agent
Import the Strands library and instantiate an agent using a local Llama model
Define custom tools the agent can use to interact with the physical world
Provide a system prompt to define the agent's personality and capabilities
Start the agent and send prompts for it to process using the local tools
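The steps above boil down to a tool-registry pattern: register plain functions as tools, then let the reasoning loop dispatch to them by name. This dependency-free sketch shows that mechanic; the real Strands SDK wraps it in an LLM-driven loop and its API differs, and `read_sensor` is a hypothetical example tool, not one from the talk.

```python
from typing import Callable, Dict

class LocalAgent:
    """Dependency-free sketch of the tool-registry pattern that agent
    SDKs such as Strands implement. The real SDK adds an LLM reasoning
    loop; here tool selection is a plain lookup so the mechanics stay
    visible."""

    def __init__(self, system_prompt: str):
        self.system_prompt = system_prompt
        self.tools: Dict[str, Callable] = {}

    def tool(self, fn: Callable) -> Callable:
        """Register a plain function as a callable tool (decorator)."""
        self.tools[fn.__name__] = fn
        return fn

    def call_tool(self, name: str, **kwargs):
        """Dispatch a tool call by name, as the LLM would request it."""
        return self.tools[name](**kwargs)

agent = LocalAgent(system_prompt="You are a factory maintenance assistant.")

@agent.tool
def read_sensor(sensor_id: str) -> float:
    """Hypothetical sensor read; a real tool would query local hardware."""
    return {"temp-01": 72.5}.get(sensor_id, 0.0)
```

Because the tools are ordinary local functions, they keep working with no cloud connection at all.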
Structured Outputs
Agents can return structured data (e.g., JSON) instead of just free-form text
This allows the agent's responses to be easily integrated with external systems
Example: Creating a Pydantic model to define the structure of a work order
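The talk used a Pydantic model for the work order; this stdlib dataclass stand-in shows the same idea of parsing the agent's JSON reply into a typed object. The field names are illustrative, not the presenters' actual schema.

```python
import json
from dataclasses import dataclass

@dataclass
class WorkOrder:
    """Stdlib stand-in for the Pydantic work-order model from the talk;
    field names are illustrative, not the presenters' actual schema."""
    asset_id: str
    priority: str
    description: str

    @classmethod
    def from_llm_json(cls, text: str) -> "WorkOrder":
        """Parse the agent's JSON reply into a typed object, rejecting
        unexpected keys so downstream systems get a predictable shape."""
        data = json.loads(text)
        allowed = {"asset_id", "priority", "description"}
        extra = set(data) - allowed
        if extra:
            raise ValueError(f"unexpected fields: {extra}")
        return cls(**data)
```

Validating the shape at the boundary is what makes the agent's output safe to hand to an external system such as a maintenance ticketing API.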
Session Management
Strands Agents support persistent memory and session management
Agents can maintain context and memory across multiple interactions, even after restarts
Uses a file-based session manager, but can also leverage cloud-based solutions
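A file-based session manager can be reduced to appending conversation turns to a JSON file so context survives restarts. This is a minimal stdlib sketch of that idea, not the Strands session manager's actual API.

```python
import json
from pathlib import Path

class FileSessionStore:
    """Minimal sketch of file-backed session persistence: conversation
    turns are appended to a JSON file so context survives restarts.
    The real Strands session manager's API differs."""

    def __init__(self, path):
        self.path = Path(path)

    def load(self) -> list:
        """Return the saved history, or an empty list for a new session."""
        if self.path.exists():
            return json.loads(self.path.read_text())
        return []

    def append(self, role: str, content: str) -> None:
        """Persist one turn immediately, so a crash loses at most
        the in-flight message."""
        history = self.load()
        history.append({"role": role, "content": content})
        self.path.write_text(json.dumps(history))
```

Because the store is just a local file, it works identically online and offline, and a cloud-backed store can replace it behind the same interface.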
Model Context Protocol (MCP) Integration
MCP is a protocol that allows agents to access external data sources and tools
Strands Agents can integrate with MCP servers to leverage pre-built capabilities
Example: Connecting to a local MongoDB database through an MCP server
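MCP messages are JSON-RPC 2.0, so a tool invocation is a small, predictable envelope. The sketch below builds a `tools/call` request; the envelope shape follows the protocol, while the tool name and arguments are illustrative, not the actual MongoDB MCP server's interface.

```python
import json

def mcp_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Build an MCP 'tools/call' request. MCP messages are JSON-RPC 2.0,
    so the envelope below follows the protocol; the tool name and
    arguments passed in are illustrative."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })
```

In practice an SDK like Strands handles this framing for you and exposes the server's tools alongside locally defined ones, so the agent can't tell a database query apart from a sensor read.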
Live Demo
Demonstration of a Jupyter notebook showcasing the capabilities of Strands Agents
Includes examples of local sensor monitoring, actuator control, structured outputs, session management, and MCP integration
Kiro - AWS's Agentic IDE
Kiro is a new agentic IDE from AWS that can leverage the Strands Agents MCP server
Allows developers to easily integrate up-to-date documentation and SDK information into their agent-based solutions
Key Takeaways
Offline AI agents powered by local LLMs and tools can solve the cloud dependency problem and address critical downtime issues in various industries.
Strands Agents provide a flexible and extensible framework for building these agentic workflows, with support for local reasoning, actions, memory, and cloud synchronization.
Integrating structured outputs and external data sources through MCP servers allows agents to seamlessly communicate with legacy systems and external APIs.
Session management and persistent memory are crucial for maintaining context and continuity, even during connectivity loss or system restarts.
Tools like Kiro, AWS's agentic IDE, can further enhance the development and integration of these agent-based solutions.
Conclusion
The presenters encourage attendees to explore the open-source Strands Agents repository and reach out for help with any challenges or ideas they have for building offline, edge-based AI solutions.