AWS re:Invent 2025 - Sustainable and cost-efficient generative AI with agentic workflows (AIM333)

Sustainable and Cost-Efficient Generative AI with Agentic Workflows

The Rise of Generative AI and Sustainability Challenges

  • Generative AI models are becoming increasingly large and resource-intensive
  • Data center energy consumption has risen sharply, and forecasts indicate that 60% of this demand will be met by burning fossil fuels
  • This could add 215-220 million tons of CO2 emissions annually

AWS Sustainability Initiatives

  • Amazon co-founded The Climate Pledge in 2019 and committed to matching its operations with 100% renewable energy, a goal it reached in 2023
  • AWS infrastructure is up to 4.1x more energy-efficient than on-premises data centers, enabling up to 90% carbon reduction
  • AWS offers hardware and data center efficiencies, including optimized silicon, low-carbon concrete, and renewable energy

Generative AI Lifecycle Optimization

  1. Problem Framing:

    • Determine if generative AI is necessary or if a simpler approach would suffice
    • Use managed services like Amazon Bedrock to leverage AWS's operational efficiencies
    • Select the appropriate model size and capabilities based on the specific use case
  2. Model Training/Adaptation:

    • Start with the least resource-intensive approach, such as prompt engineering
    • Progress to more advanced techniques like retrieval-augmented generation and parameter-efficient fine-tuning
    • Use managed services like SageMaker for training from scratch, leveraging efficient silicon like Trainium
  3. Model Deployment and Inference:

    • Deploy models on efficient hardware like Inferentia and Graviton instances
    • Optimize models through techniques like pruning, quantization, and knowledge distillation
    • Leverage Bedrock features like prompt caching and intelligent prompt routing to reduce costs
  4. Monitoring and Observability:

    • Implement monitoring and observability practices throughout the lifecycle
    • Use tools like Amazon CloudWatch, SageMaker Profiler, and the NVIDIA System Management Interface (nvidia-smi)
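
As a rough illustration of the quantization technique mentioned in step 3, here is a minimal sketch of symmetric 8-bit quantization in plain Python. It is not code from the talk and not how any AWS service implements quantization; the function names and the sample weights are made up for the example. It only shows the core idea: store weights as small integers plus one scale factor, accepting a small rounding error in exchange for a smaller memory footprint.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Fall back to a scale of 1.0 for an all-zero tensor to avoid dividing by zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integers and the scale."""
    return [v * scale for v in quantized]


# Hypothetical weights, chosen only to make the example easy to follow.
weights = [0.82, -1.27, 0.05, 0.0, 0.64]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight now fits in one byte instead of the four bytes of a 32-bit float, which is where the memory and energy savings at inference time come from. Production toolchains typically add per-channel scales and calibration data, which this sketch leaves out.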

Agentic AI Systems and Amazon Bedrock AgentCore

  • Agentic AI systems go beyond generative AI, using large language models to complete specific, goal-oriented tasks
  • Amazon Bedrock AgentCore is a fully managed service that provides the infrastructure and capabilities needed to build production-grade agentic AI systems:
    • AgentCore Runtime: Provides session isolation and suspends CPU usage while an agent waits for responses, reducing compute costs
    • AgentCore Gateway: Offers a unified way for agents to access external tools and APIs, with semantic search over tools to optimize context usage
    • Additional components: Tools, code interpreter, identity management, and observability
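
To make the context-usage point about the Gateway concrete, here is a small sketch of the general idea: rank the available tools against the user's request and pass only the most relevant ones to the model. This is not the AgentCore Gateway API. The `Tool` class, the `select_tools` function, and the tool catalog are hypothetical, and simple word overlap (Jaccard similarity) stands in for real semantic search, which would normally use embeddings.

```python
from dataclasses import dataclass
from typing import List, Set

# Common words ignored when comparing a request with a tool description.
STOPWORDS = {"a", "an", "the", "for", "to", "of"}


@dataclass
class Tool:
    name: str
    description: str


def tokenize(text: str) -> Set[str]:
    """Lowercase the text, strip basic punctuation, and drop stopwords."""
    words = {w.strip(".,").lower() for w in text.split()}
    return {w for w in words if w and w not in STOPWORDS}


def relevance(query: str, tool: Tool) -> float:
    """Jaccard similarity between the request and the tool description."""
    q, d = tokenize(query), tokenize(tool.description)
    union = q | d
    return len(q & d) / len(union) if union else 0.0


def select_tools(query: str, tools: List[Tool], top_k: int = 2) -> List[Tool]:
    """Return up to top_k tools with a non-zero relevance score, best first."""
    scored = [(relevance(query, t), t) for t in tools]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in scored[:top_k]]


# Hypothetical tool catalog for illustration only.
catalog = [
    Tool("get_weather", "Get the current weather forecast for a city"),
    Tool("create_invoice", "Create a customer invoice for an order"),
    Tool("search_flights", "Search available flights between two airports"),
    Tool("refund_order", "Refund a customer order payment"),
]

selected = select_tools("refund the payment for a customer order", catalog)
print([t.name for t in selected])
```

Only the selected tool descriptions need to be placed in the prompt, so the model processes fewer tokens per request than it would if the whole catalog were included every time.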

Key Takeaways

  • Generative AI's rapid growth poses significant sustainability challenges, which can be addressed through careful optimization and the use of managed services
  • AWS provides a range of hardware, software, and operational efficiencies to enable more sustainable and cost-effective generative AI deployments
  • Agentic AI systems built on Amazon Bedrock AgentCore can further improve efficiency and reduce the environmental impact of applications based on large language models
