AWS re:Invent 2025 - Optimize for AWS with intelligent automation (AIM235)

Optimizing AWS with Intelligent Automation for Unpredictable Agentic AI Workloads

The Challenge of Agentic AI Workloads

  • Agentic AI applications are becoming increasingly prevalent, but their resource usage and performance are hard to manage.
  • These workloads are highly unpredictable, often spawning hundreds of micro-tasks and causing resource spikes that overwhelm traditional scaling approaches.
  • Teams respond by over-provisioning expensive resources such as GPUs to protect performance, which leaves significant capacity idle and wastes money.
  • Traditional observability and FinOps tools provide visibility but cannot take automated action in real time to optimize these dynamic workloads.

Transforming Agentic AI Workloads with Intelligent Automation

  • Turbonomic optimizes agentic AI workloads in real time by continuously analyzing metrics such as GPU utilization, vCPU saturation, and throughput.
  • It uses this data to make automated decisions that right-size resources, scaling up when demand spikes and scaling down when utilization is low.
  • This maintains performance by ensuring workloads have the resources they need, and it reduces cost by eliminating idle capacity.
  • Turbonomic integrates across the full technology stack, including EC2, EKS, and hybrid environments, to optimize the entire supply chain supporting agentic AI applications.
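
The scale-up and scale-down behavior described above can be illustrated with a minimal sketch. This is not Turbonomic's algorithm or API, which the talk summary does not describe. It is a simple proportional rule (similar in spirit to the formula used by the Kubernetes Horizontal Pod Autoscaler), and the `Sample` type, the 70% utilization target, and the 1 to 16 GPU bounds are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Sample:
    """One metrics sample for a workload (hypothetical structure)."""
    gpu_utilization: float  # mean busy fraction across allocated GPUs, 0.0 to 1.0
    allocated_gpus: int     # GPUs currently assigned to the workload


def desired_gpus(s: Sample, target: float = 0.7,
                 min_gpus: int = 1, max_gpus: int = 16) -> int:
    """GPU count that would bring utilization to roughly `target` (assumed value)."""
    busy = s.gpu_utilization * s.allocated_gpus   # GPUs' worth of actual work
    wanted = ceil(busy / target)                  # round up to protect performance
    return max(min_gpus, min(max_gpus, wanted))   # clamp to the allowed range


def decide(s: Sample) -> str:
    """Translate the target count into a scale-up, scale-down, or hold action."""
    want = desired_gpus(s)
    if want > s.allocated_gpus:
        return f"scale up: {s.allocated_gpus} -> {want} GPUs"
    if want < s.allocated_gpus:
        return f"scale down: {s.allocated_gpus} -> {want} GPUs"
    return "hold"


if __name__ == "__main__":
    # A demand spike, an over-provisioned workload, and one already on target.
    for sample in (Sample(0.95, 4), Sample(0.20, 8), Sample(0.70, 4)):
        print(decide(sample))
```

A production system would evaluate many metrics together (the summary mentions vCPU saturation and throughput alongside GPU utilization) and would smooth samples over time to avoid reacting to momentary noise. The sketch shows only the core idea of deriving an action from measured utilization.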

Proven Results for Agentic AI Workloads

  • Case study: Turbonomic helped a large AI models team (supporting the LLMs behind IBM watsonx) improve its environment:
    • 5.3x increase in idle GPU capacity, leaving 16 GPUs available for other workloads
    • 2x throughput improvement without impacting latency
    • 13 fewer GPUs required, with the freed-up GPUs allocated to new AI workloads
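
The three GPU figures in the case study can be reconciled under one reading, which is an inference and not something the summary states: 16 is the number of idle GPUs after optimization, and 13 is the increase over the starting point. The short check below works through that arithmetic; the baseline of 3 idle GPUs is derived, not reported.

```python
# Figures taken from the case-study bullets above.
idle_after = 16   # idle GPUs available after optimization
gpus_saved = 13   # GPUs no longer required by the workload

# Inferred baseline (an assumption, not a reported number).
idle_before = idle_after - gpus_saved

# Ratio of idle GPUs after versus before optimization.
ratio = idle_after / idle_before

print(f"idle before: {idle_before}, idle after: {idle_after}, ratio: {ratio:.1f}x")
```

Under this reading, the 5.3x figure describes how much idle (reusable) GPU capacity grew, which matches the 13 GPUs said to be freed for new workloads.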

Key Takeaways

  1. Agentic AI workloads are highly dynamic and unpredictable, so observability alone is not enough to manage them.
  2. Intelligent automation that continuously optimizes resources in real time is critical for managing these workloads effectively.
  3. Piloting on a single high-value agentic AI workload can demonstrate the benefits of this approach and lead to broader adoption.
  4. Turbonomic integrates across the full technology stack to optimize agentic AI workloads for both performance and cost.
