AWS re:Invent 2025 - AI Agents at the Edge: Build for Offline, Scale in Cloud (DEV301)

Introduction

  • Presentation by Anna, an AWS Developer Advocate, and David, an AWS Community Hero
  • Topic: Leveraging local Large Language Models (LLMs) and agentic workflows to build AI solutions that can operate offline and scale in the cloud

The Challenge of Downtime

  • Downtime in industrial operations can cost over $260,000 per hour on average
  • Many industries are affected by this problem, including mining, manufacturing, oil and gas, energy, government, and retail

The Cloud Dependency Problem

  • AI agents typically depend on a cloud connection to reach their LLM
  • If the internet connection is lost, the agents become unusable

Requirements for Offline AI Agents

  1. Agents need to be powered by LLMs that can reason and think locally, without relying on the cloud
  2. Agents need access to local tools and actions to take real-world steps, not just language processing
  3. Agents need to maintain context and memory between sessions, even when disconnected from the cloud
  4. Agents need to be able to sync data and updates back to the cloud when connectivity is restored
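Requirement 4 above is often implemented as an offline-first queue: events are buffered on local disk while disconnected and flushed when connectivity returns. A minimal stdlib-only sketch of that pattern (the real system would call an AWS API in `flush`; all names here are illustrative):

```python
import json
import os

class SyncQueue:
    """Buffers agent events on local disk while offline, then flushes
    them to the cloud once connectivity is restored. Illustrative
    sketch -- a real implementation would upload via an AWS SDK call."""

    def __init__(self, path):
        self.path = path

    def enqueue(self, event):
        # Append one JSON line per event so a crash loses at most one record.
        with open(self.path, "a") as f:
            f.write(json.dumps(event) + "\n")

    def flush(self, upload):
        """Call `upload(event)` for each buffered event, then clear the file."""
        if not os.path.exists(self.path):
            return 0
        with open(self.path) as f:
            events = [json.loads(line) for line in f if line.strip()]
        for event in events:
            upload(event)
        os.remove(self.path)
        return len(events)
```

The append-only JSON-lines file keeps writes cheap on constrained edge hardware and makes partial uploads easy to resume.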

The Hybrid Architecture Solution

  • Using the open-source Strands Agents SDK and the Ollama local LLM runtime
  • Strands Agents handle the reasoning and decision-making using local LLMs
  • Strands Agents can call local tools and actions to interact with the physical world
  • Session management and memory are handled locally, even during connectivity loss
  • When online, the agents can sync data and updates back to the cloud via Amazon Bedrock
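The routing decision in this hybrid setup can be as simple as a connectivity probe that picks a cloud model when reachable and a local one otherwise. A hedged sketch, assuming the model identifiers below are placeholders rather than exact IDs from the talk:

```python
import socket

def is_online(host="8.8.8.8", port=53, timeout=1.5):
    """Best-effort connectivity probe: attempt a TCP handshake with a
    well-known public resolver. Returns False on any network error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def pick_model():
    """Route reasoning to the cloud when reachable, otherwise fall back
    to a local runtime. Backend names and model tags are illustrative."""
    if is_online():
        return ("bedrock", "anthropic.claude-sonnet")  # hypothetical cloud model id
    return ("ollama", "llama3.2")  # hypothetical local model tag
```

In practice the probe would target the actual cloud endpoint the agent uses, not a DNS resolver.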

Strands Agents

  • Open-source Python and TypeScript SDK for building agentic workflows
  • Allows easy integration of LLMs, custom tools, memory management, and more
  • Supports models from multiple providers, including Amazon Bedrock, Anthropic, OpenAI, and others

Creating a Local Offline Agent

  1. Import the Strands library and instantiate an agent using a local model served by Ollama
  2. Define custom tools the agent can use to interact with the physical world
  3. Provide a system prompt to define the agent's personality and capabilities
  4. Start the agent and send prompts for it to process using the local tools
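The steps above can be sketched without the SDK itself. A stdlib-only sketch of the same shape (tool registry, system prompt, dispatch); the tool names and return values are illustrative, and in the real Strands SDK the local LLM, not this dispatcher, decides which tool to call:

```python
def read_temperature(sensor_id):
    """Stand-in for a real sensor read; returns a fixed value here."""
    return {"sensor": sensor_id, "celsius": 78.5}

def stop_pump(pump_id):
    """Stand-in for a real actuator command."""
    return {"pump": pump_id, "status": "stopped"}

# Step 2: register the custom tools the agent may use.
TOOLS = {"read_temperature": read_temperature, "stop_pump": stop_pump}

# Step 3: the system prompt defines the agent's role and capabilities.
SYSTEM_PROMPT = "You are a maintenance agent for an offline industrial site."

def run_tool(name, **kwargs):
    """Dispatch a tool call the way an agent runtime would after the
    local LLM decides which tool to invoke and with which arguments."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)
```

In the SDK, tools are plain functions registered with the agent; the model's reasoning loop selects among them based on the prompt.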

Structured Outputs

  • Agents can return structured data (e.g., JSON) instead of just free-form text
  • This allows the agent's responses to be easily integrated with external systems
  • Example: Creating a Pydantic model to define the structure of a work order
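The talk's example used a Pydantic model; the same idea can be shown with stdlib dataclasses. A sketch with illustrative field names, not the schema from the demo:

```python
import json
from dataclasses import asdict, dataclass

@dataclass
class WorkOrder:
    """Schema the agent must fill in; field names are illustrative."""
    asset_id: str
    priority: str     # e.g. "low", "medium", "high"
    description: str

def parse_work_order(raw_json):
    """Validate the model's JSON reply against the schema before
    handing it to a downstream maintenance system."""
    data = json.loads(raw_json)
    return WorkOrder(**data)  # raises TypeError on unexpected fields
```

Forcing the model to emit this shape means the response can be consumed by external systems without fragile text parsing.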

Session Management

  • Strands Agents support persistent memory and session management
  • Agents can maintain context and memory across multiple interactions, even after restarts
  • Uses a file-based session manager, but can also leverage cloud-based solutions
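A file-based session manager boils down to persisting the conversation history as JSON on local disk so context survives restarts. A minimal sketch of the idea, not the SDK's actual session-manager API:

```python
import json
import os

class FileSessionManager:
    """Persists conversation history as JSON on local disk so the agent
    keeps context across restarts and connectivity loss. Illustrative
    sketch; the real SDK class may differ."""

    def __init__(self, path):
        self.path = path
        self.messages = self._load()

    def _load(self):
        if os.path.exists(self.path):
            with open(self.path) as f:
                return json.load(f)
        return []

    def append(self, role, content):
        self.messages.append({"role": role, "content": content})
        # Write the full history back so the file is always consistent.
        with open(self.path, "w") as f:
            json.dump(self.messages, f)
```

Swapping the file backend for a cloud store (e.g. an object store or database) is what enables the "scale in cloud" half of the architecture.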

Model Context Protocol (MCP) Integration

  • MCP is a protocol that allows agents to access external data sources and tools
  • Strands Agents can integrate with MCP servers to leverage pre-built capabilities
  • Example: Connecting to a local MongoDB database through an MCP server
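MCP is built on JSON-RPC 2.0, and a tool invocation travels as a `tools/call` request. A sketch of constructing such a message; the tool name and arguments below are illustrative, not from the demo's MongoDB server:

```python
import itertools
import json

_ids = itertools.count(1)  # JSON-RPC requests need unique ids

def mcp_tool_call(tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message.
    Tool name and argument keys are illustrative."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })
```

The agent framework handles this framing for you; the sketch just shows what crosses the wire between agent and MCP server.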

Live Demo

  • Demonstration of a Jupyter notebook showcasing the capabilities of Strands Agents
  • Includes examples of local sensor monitoring, actuator control, structured outputs, session management, and MCP integration

Kiro - AWS's Agentic IDE

  • Kiro is a new agentic IDE from AWS that can leverage the Strands Agents MCP server
  • Allows developers to easily integrate up-to-date documentation and SDK information into their agent-based solutions

Key Takeaways

  1. Offline AI agents powered by local LLMs and tools can solve the cloud dependency problem and address critical downtime issues in various industries.
  2. Strands Agents provide a flexible and extensible framework for building these agentic workflows, with support for local reasoning, actions, memory, and cloud synchronization.
  3. Integrating structured outputs and external data sources through MCP servers allows agents to seamlessly communicate with legacy systems and external APIs.
  4. Session management and persistent memory are crucial for maintaining context and continuity, even during connectivity loss or system restarts.
  5. Tools like Kiro, AWS's agentic IDE, can further enhance the development and integration of these agent-based solutions.

Conclusion

  • The presenters encourage attendees to explore the open-source Strands Agents repository and reach out for help with any challenges or ideas they have for building offline, edge-based AI solutions.
