AWS re:Invent 2025 - Designing local Generative AI inference with AWS IoT Greengrass (DEV316)

Designing Local Generative AI Inference with AWS IoT Greengrass

Introduction to Physical AI

Physical AI refers to AI systems that sense the real world, make decisions, and take physical actions that change their environment.

Key aspects of physical AI include:

Responsiveness: AI can react quickly to changes in the environment.
Autonomy: AI can think and act independently, not just follow orders.
Collaboration: AI can understand human intent and work together with humans towards a common goal.

Physical AI combines AI and robotics, moving AI from the digital world into the physical world.

Local vs. Cloud Inference

Local inference runs the AI model directly on the device, while cloud inference sends data to the cloud for processing.

Local inference provides lower latency and can work offline, but has limited compute resources.

Cloud inference can leverage more powerful compute resources, but incurs higher latency due to the round-trip communication.

The presenter demonstrated the latency difference between local and cloud inference using a robot arm example:

Over a local USB connection, latency was very low, so the arm's movement stayed synchronized.
With cloud inference over an LTE network, latency was 500-600 ms, which caused noticeable delays in the arm's movement.
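
To see why a 500-600 ms round trip shows up in the arm's motion, it can help to express the delay in control cycles. The sketch below is illustrative only: the 50 Hz control loop rate and the 2 ms USB delay are assumptions (the talk summary gives neither), and `cycles_in_flight` is a hypothetical helper that counts how many whole control cycles pass while one command is in flight.

```python
import math


def cycles_in_flight(latency_ms: float, loop_hz: float) -> int:
    """Number of whole control cycles that elapse while one command is in flight."""
    period_ms = 1000.0 / loop_hz
    return math.floor(latency_ms / period_ms)


# Assumed values for illustration; the loop rate and USB delay are not from the talk.
LOOP_HZ = 50.0                   # hypothetical control loop rate (20 ms period)
USB_LATENCY_MS = 2.0             # hypothetical local USB delay
LTE_LATENCY_MS = (500.0, 600.0)  # range reported for the cloud demo

local_cycles = cycles_in_flight(USB_LATENCY_MS, LOOP_HZ)
cloud_cycles = [cycles_in_flight(ms, LOOP_HZ) for ms in LTE_LATENCY_MS]
print(local_cycles, cloud_cycles)
```

Under these assumptions the local path responds within the same control cycle, while the cloud path lags by 25 to 30 cycles, which is consistent with the visible delay described in the demo.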

Updating AI Models at the Edge

AI models can become outdated quickly, so the ability to update models is crucial for physical AI systems.

The presenter showed how to use AWS IoT Greengrass to deploy and update AI models on edge devices:

Containerized AI models are built and pushed to Amazon ECR.
IoT Greengrass recipes are used to deploy the container images to edge devices.
This allows easy updating of AI models without rebuilding the entire system.
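
The deployment flow above can be sketched as a small recipe generator. This is a minimal sketch, not the presenter's recipe: the component name `com.example.EdgeLlm`, the account ID, region, repository name, and the `docker run` arguments are all placeholders, and the dependency version requirements should be checked against the current Greengrass documentation. It builds a Greengrass v2 component recipe (JSON form) whose artifact is a container image in Amazon ECR, so that shipping a new model comes down to publishing a new image tag and a new component version.

```python
import json


def build_recipe(component_name: str, version: str, image_uri: str) -> dict:
    """Build a Greengrass v2 component recipe that runs one container image."""
    return {
        "RecipeFormatVersion": "2020-01-25",
        "ComponentName": component_name,
        "ComponentVersion": version,
        "ComponentDescription": "Runs a containerized inference model.",
        "ComponentPublisher": "Example",
        "ComponentDependencies": {
            # Pulls the image; the token exchange service supplies ECR credentials.
            "aws.greengrass.DockerApplicationManager": {"VersionRequirement": "~2.0.0"},
            "aws.greengrass.TokenExchangeService": {"VersionRequirement": "~2.0.0"},
        },
        "Manifests": [
            {
                "Platform": {"os": "linux"},
                "Lifecycle": {"Run": f"docker run --rm {image_uri}"},
                "Artifacts": [{"URI": f"docker:{image_uri}"}],
            }
        ],
    }


# Placeholder repository; substitute a real account ID, region, and repo name.
IMAGE_REPO = "123456789012.dkr.ecr.us-east-1.amazonaws.com/edge-llm"

recipe_v1 = build_recipe("com.example.EdgeLlm", "1.0.0", f"{IMAGE_REPO}:1.0.0")
recipe_v2 = build_recipe("com.example.EdgeLlm", "1.1.0", f"{IMAGE_REPO}:1.1.0")
print(json.dumps(recipe_v2, indent=2))
```

Only the component version and the image tag differ between `recipe_v1` and `recipe_v2`; the rest of the device setup is untouched, which is the point of packaging the model as a container.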

Key Considerations for Physical AI

Network Connectivity: Edge devices may not have constant cloud connectivity, so solutions need to work with intermittent or limited network access.

Options include using LTE or satellite-based broadband for outdoor connectivity.
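
One way to tolerate intermittent connectivity is to treat the cloud as optional and fall back to the on-device model when the link fails. The sketch below is an assumption about how this could be structured, not something shown in the talk: `infer_with_fallback`, `flaky_cloud`, and `local_model` are hypothetical names, and the two callables stand in for real cloud and local inference clients.

```python
from typing import Callable, Tuple


def infer_with_fallback(
    prompt: str,
    cloud_infer: Callable[[str], str],
    local_infer: Callable[[str], str],
) -> Tuple[str, str]:
    """Try the cloud first; fall back to the on-device model on network errors."""
    try:
        return "cloud", cloud_infer(prompt)
    except (ConnectionError, TimeoutError):
        return "local", local_infer(prompt)


# Stand-ins for real clients (hypothetical).
def flaky_cloud(prompt: str) -> str:
    raise TimeoutError("LTE link down")


def local_model(prompt: str) -> str:
    return f"local answer to: {prompt}"


source, answer = infer_with_fallback("pick up the cup", flaky_cloud, local_model)
print(source, answer)
```

For latency-sensitive control the order could be reversed, using the local model by default and the cloud only for requests that can tolerate a round trip.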

Model File Size: AI models can be very large (15 GB or more), which makes deployment and updates challenging.

IoT Greengrass provides options to handle large model files, such as using container images or custom download scripts.
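
As an illustration of what a custom download script might do for a multi-gigabyte model, the sketch below resumes a partial download from the bytes already on disk and then verifies a SHA-256 digest. It is a sketch under stated assumptions, not the presenter's script: `resume_download` and `sha256_of` are hypothetical helpers, and `fetch_range` is an injected callable so the example runs without a network. In practice it would typically wrap an HTTP range request.

```python
import hashlib
import os
import tempfile
from typing import Callable


def resume_download(
    dest_path: str,
    total_size: int,
    fetch_range: Callable[[int, int], bytes],
    chunk_size: int = 4,
) -> None:
    """Append the missing bytes to dest_path, starting from what is already on disk."""
    offset = os.path.getsize(dest_path) if os.path.exists(dest_path) else 0
    with open(dest_path, "ab") as out:
        while offset < total_size:
            end = min(offset + chunk_size, total_size)
            data = fetch_range(offset, end)  # bytes in [offset, end)
            if not data:
                raise IOError("source returned no data; aborting to avoid a stall")
            out.write(data)
            offset += len(data)


def sha256_of(path: str) -> str:
    """Hash a file in blocks so large models do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


# Tiny in-memory stand-in for a model file (illustration only).
MODEL_BYTES = b"pretend-model-weights"


def fetch_from_memory(start: int, end: int) -> bytes:
    return MODEL_BYTES[start:end]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.bin")
    with open(path, "wb") as f:
        f.write(MODEL_BYTES[:6])  # simulate an interrupted earlier download
    resume_download(path, len(MODEL_BYTES), fetch_from_memory)
    download_ok = sha256_of(path) == hashlib.sha256(MODEL_BYTES).hexdigest()

print(download_ok)
```

Resuming avoids re-sending bytes over a slow or metered link, and the digest check guards against loading a truncated or corrupted model.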

Conclusion

Physical AI combines the power of AI with the ability to interact with the real world, enabling new applications and use cases.

Balancing local and cloud inference, as well as maintaining updatable AI models at the edge, are key challenges addressed by solutions like AWS IoT Greengrass.

By leveraging physical AI and edge computing, organizations can unlock new opportunities to automate and optimize physical processes, improve responsiveness, and enhance human-machine collaboration.
