AWS re:Invent 2025 - Sustainable and cost-efficient generative AI with agentic workflows (AIM333)
Sustainable and Cost-Efficient Generative AI with Agentic Workflows
The Rise of Generative AI and Sustainability Challenges
Generative AI models are becoming increasingly larger and more resource-intensive
Data center energy consumption is rising sharply, and forecasts indicate that roughly 60% of the additional demand will be met by burning fossil fuels
This could lead to an additional 215-220 million tons of CO2 emissions annually
AWS Sustainability Initiatives
Amazon co-founded the Climate Pledge in 2019, committing to match 100% of its electricity consumption with renewable energy, a goal achieved in 2023
The AWS cloud is up to 4.1x more energy-efficient than on-premises infrastructure, enabling carbon reductions of up to 90% for migrated workloads
AWS offers hardware and data center efficiencies, including optimized silicon, low-carbon concrete, and renewable energy
Generative AI Lifecycle Optimization
Problem Framing:
Determine if generative AI is necessary or if a simpler approach would suffice
Use managed services like Amazon Bedrock to leverage AWS's operational efficiencies
Select the appropriate model size and capabilities based on the specific use case
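One way to act on model right-sizing is to route each request to the smallest model likely to handle it, reserving large models for genuinely complex work. The sketch below illustrates the idea; the Bedrock model IDs and the complexity heuristic are illustrative assumptions, not AWS guidance.

```python
# Illustrative model-routing sketch: send simple tasks to small models.
# Model IDs and the complexity heuristic are assumptions for this example.

MODEL_TIERS = {
    "simple": "amazon.nova-micro-v1:0",    # classification, extraction
    "moderate": "amazon.nova-lite-v1:0",   # summarization, short Q&A
    "complex": "amazon.nova-pro-v1:0",     # multi-step reasoning
}

def classify_task(prompt: str) -> str:
    """Toy heuristic: longer prompts with multiple questions count as harder."""
    words = len(prompt.split())
    questions = prompt.count("?")
    if words > 200 or questions > 2:
        return "complex"
    if words > 50 or questions > 0:
        return "moderate"
    return "simple"

def pick_model(prompt: str) -> str:
    return MODEL_TIERS[classify_task(prompt)]
```

In practice the routing signal would come from an evaluation set or a lightweight classifier rather than word counts, but the cost and energy savings come from the same place: most traffic never touches the largest model.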
Model Training/Adaptation:
Start with the least resource-intensive approach, such as prompt engineering
Progress to more advanced techniques like retrieval-augmented generation and parameter-efficient fine-tuning
Use managed services like Amazon SageMaker when training from scratch, leveraging purpose-built silicon such as AWS Trainium
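The savings from parameter-efficient fine-tuning can be made concrete with a back-of-the-envelope calculation. The snippet below computes the trainable-parameter count for LoRA (a common parameter-efficient technique) on a single weight matrix; the hidden size and rank are illustrative values.

```python
# Back-of-the-envelope: trainable parameters under LoRA vs. full
# fine-tuning for one weight matrix. Dimensions here are illustrative.

def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA replaces the update to a d_out x d_in matrix with two
    low-rank factors: B (d_out x rank) and A (rank x d_in)."""
    return d_out * rank + rank * d_in

d_in = d_out = 4096          # a typical transformer hidden size
full = d_in * d_out          # 16,777,216 parameters to update fully
lora = lora_trainable_params(d_in, d_out, rank=8)   # 65,536 parameters

print(f"LoRA trains {lora / full:.2%} of the full matrix")  # 0.39%
```

Training well under 1% of the parameters translates directly into less GPU memory, shorter training runs, and lower energy use, which is why the session recommends exhausting these techniques before training from scratch.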
Model Deployment and Inference:
Deploy models on efficient hardware like Inferentia and Graviton instances
Optimize models through techniques like pruning, quantization, and knowledge distillation
Leverage Bedrock features like prompt caching and intelligent prompt routing to reduce costs
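Of the optimization techniques above, quantization is easy to illustrate end to end. The sketch below shows symmetric int8 weight quantization in its simplest form: weights are stored as 8-bit integers plus one float scale, cutting memory roughly 4x versus fp32. Real toolchains (e.g., per-channel or activation-aware schemes) are more sophisticated; this is a minimal model of the idea.

```python
# Minimal sketch of symmetric int8 weight quantization: store weights
# as 8-bit integers plus one float scale factor.

def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    # Map the largest magnitude to 127; guard against all-zero input.
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    return [v * scale for v in q]

w = [0.12, -0.5, 0.33, 0.91, -0.07]
q, s = quantize_int8(w)
restored = dequantize(q, s)
max_err = max(abs(a - b) for a, b in zip(w, restored))
# Each value now occupies 1 byte instead of 4; the rounding error per
# weight is bounded by scale / 2.
```

Smaller weights mean less memory bandwidth per token, which is where most of the inference-time energy and cost savings come from.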
Monitoring and Observability:
Implement monitoring and observability practices throughout the lifecycle
Utilize tools like Amazon CloudWatch, SageMaker Profiler, and the NVIDIA System Management Interface (nvidia-smi)
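A useful first step before wiring up those tools is to track per-request latency and token usage in the application itself. The sketch below is a minimal local aggregator; in production the summary values could be exported as custom metrics (for example via the CloudWatch `put_metric_data` API), but the class itself is an illustration, not an AWS SDK feature.

```python
# Lightweight local tracker for per-request inference metrics.
# In production, summaries could be pushed to CloudWatch as custom metrics.
from dataclasses import dataclass, field

@dataclass
class InferenceMetrics:
    latencies_ms: list = field(default_factory=list)
    tokens: list = field(default_factory=list)

    def record(self, latency_ms: float, token_count: int) -> None:
        self.latencies_ms.append(latency_ms)
        self.tokens.append(token_count)

    def summary(self) -> dict:
        n = len(self.latencies_ms)
        return {
            "requests": n,
            "avg_latency_ms": sum(self.latencies_ms) / n if n else 0.0,
            "total_tokens": sum(self.tokens),
        }
```

Tracking tokens per request is what makes cost and energy regressions visible: a prompt change that doubles token usage shows up immediately, long before it shows up on a bill.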
Agentic AI Systems and Amazon Bedrock AgentCore
Agentic AI systems go beyond single-shot generative AI, using large language models to plan and execute specific, goal-oriented tasks
Amazon Bedrock AgentCore is a fully managed service that provides the infrastructure and capabilities for building production-grade agentic AI systems:
AgentCore Runtime: Provides per-session isolation and suspends compute while the agent waits on model or tool responses, reducing compute costs
AgentCore Gateway: Offers a unified way for agents to access external tools and APIs, with semantic search to optimize context usage
Additional components: Built-in tools, a code interpreter, identity management, and observability
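The context savings from the Gateway's semantic tool search can be seen in a simplified stand-in: rank registered tool descriptions against the user request and pass only the best match(es) to the model, rather than every tool definition. The real Gateway uses embedding-based search; the keyword-overlap scoring and tool names below are toy substitutes for illustration.

```python
# Toy stand-in for semantic tool selection: surface only the most
# relevant tool(s) to the model instead of the full catalog.
# Tool names/descriptions and Jaccard scoring are illustrative only.

TOOLS = {
    "get_weather": "fetch current weather forecast for a city",
    "create_ticket": "open a support ticket for a customer issue",
    "query_orders": "look up order status and shipping details",
}

def score(query: str, description: str) -> float:
    q, d = set(query.lower().split()), set(description.lower().split())
    return len(q & d) / len(q | d)  # Jaccard similarity as a crude proxy

def top_tools(query: str, k: int = 1) -> list[str]:
    ranked = sorted(TOOLS, key=lambda t: score(query, TOOLS[t]), reverse=True)
    return ranked[:k]
```

With hundreds of registered tools, sending only the relevant few keeps prompts short, which lowers both per-request cost and the compute (and energy) spent on every agent step.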
Key Takeaways
Generative AI's rapid growth poses significant sustainability challenges, which can be addressed through careful optimization and the use of managed services
AWS provides a range of hardware, software, and operational efficiencies to enable more sustainable and cost-effective generative AI deployments
Agentic AI systems built on Amazon Bedrock AgentCore can further improve efficiency and reduce the environmental impact of large language model-based applications