AWS re:Invent 2025 - Sustainable and cost-efficient generative AI with agentic workflows (AIM333)
Sustainable and Cost-Efficient Generative AI with Agentic Workflows
The Rise of Generative AI and Sustainability Challenges
Generative AI models are becoming increasingly larger and more resource-intensive
Data center energy consumption is rising sharply, and forecasts indicate that roughly 60% of the additional demand will be met by burning fossil fuels
This could lead to an additional 215-220 million tons of CO2 emissions annually
AWS Sustainability Initiatives
Amazon co-founded the Climate Pledge in 2019, committing to match 100% of its electricity consumption with renewable energy, a goal achieved in 2023
The AWS cloud is up to 4.1x more energy-efficient than on-premises infrastructure, enabling carbon reductions of up to 90% for migrated workloads
AWS offers hardware and data center efficiencies, including optimized silicon, low-carbon concrete, and renewable energy
Generative AI Lifecycle Optimization
Problem Framing:
Determine if generative AI is necessary or if a simpler approach would suffice
Use managed services like Amazon Bedrock to leverage AWS's operational efficiencies
Select the appropriate model size and capabilities based on the specific use case
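One way to act on model right-sizing is to route each request to the smallest model likely to handle it, reserving large models for genuinely complex work. The sketch below illustrates the idea; the Bedrock model IDs and the complexity heuristic are illustrative assumptions, not AWS guidance.

```python
# Illustrative model-routing sketch: send simple tasks to small models.
# Model IDs and the complexity heuristic are assumptions for this example.

MODEL_TIERS = {
    "simple": "amazon.nova-micro-v1:0",    # classification, extraction
    "moderate": "amazon.nova-lite-v1:0",   # summarization, short Q&A
    "complex": "amazon.nova-pro-v1:0",     # multi-step reasoning
}

def classify_task(prompt: str) -> str:
    """Toy heuristic: longer prompts with multiple questions count as harder."""
    words = len(prompt.split())
    questions = prompt.count("?")
    if words > 200 or questions > 2:
        return "complex"
    if words > 50 or questions > 0:
        return "moderate"
    return "simple"

def pick_model(prompt: str) -> str:
    return MODEL_TIERS[classify_task(prompt)]
```

In practice the routing signal would come from an evaluation set or a lightweight classifier rather than word counts, but the cost and energy savings come from the same place: most traffic never touches the largest model.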
Model Training/Adaptation:
Start with the least resource-intensive approach, such as prompt engineering
Progress to more advanced techniques like retrieval-augmented generation and parameter-efficient fine-tuning
Use managed services like Amazon SageMaker when training from scratch, leveraging purpose-built silicon such as AWS Trainium
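The savings from parameter-efficient fine-tuning can be made concrete with a back-of-the-envelope calculation. The snippet below computes the trainable-parameter count for LoRA (a common parameter-efficient technique) on a single weight matrix; the hidden size and rank are illustrative values.

```python
# Back-of-the-envelope: trainable parameters under LoRA vs. full
# fine-tuning for one weight matrix. Dimensions here are illustrative.

def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA replaces the update to a d_out x d_in matrix with two
    low-rank factors: B (d_out x rank) and A (rank x d_in)."""
    return d_out * rank + rank * d_in

d_in = d_out = 4096          # a typical transformer hidden size
full = d_in * d_out          # 16,777,216 parameters to update fully
lora = lora_trainable_params(d_in, d_out, rank=8)   # 65,536 parameters

print(f"LoRA trains {lora / full:.2%} of the full matrix")  # 0.39%
```

Training well under 1% of the parameters translates directly into less GPU memory, shorter training runs, and lower energy use, which is why the session recommends exhausting these techniques before training from scratch.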
Model Deployment and Inference:
Deploy models on efficient hardware like Inferentia and Graviton instances
Optimize models through techniques like pruning, quantization, and knowledge distillation
Leverage Bedrock features like prompt caching and intelligent prompt routing to reduce costs
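Of the optimization techniques above, quantization is easy to illustrate end to end. The sketch below shows symmetric int8 weight quantization in its simplest form: weights are stored as 8-bit integers plus one float scale, cutting memory roughly 4x versus fp32. Real toolchains (e.g., per-channel or activation-aware schemes) are more sophisticated; this is a minimal model of the idea.

```python
# Minimal sketch of symmetric int8 weight quantization: store weights
# as 8-bit integers plus one float scale factor.

def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    # Map the largest magnitude to 127; guard against all-zero input.
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    return [v * scale for v in q]

w = [0.12, -0.5, 0.33, 0.91, -0.07]
q, s = quantize_int8(w)
restored = dequantize(q, s)
max_err = max(abs(a - b) for a, b in zip(w, restored))
# Each value now occupies 1 byte instead of 4; the rounding error per
# weight is bounded by scale / 2.
```

Smaller weights mean less memory bandwidth per token, which is where most of the inference-time energy and cost savings come from.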
Monitoring and Observability:
Implement monitoring and observability practices throughout the lifecycle
Utilize tools like Amazon CloudWatch, SageMaker Profiler, and the NVIDIA System Management Interface (nvidia-smi)
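A useful first step before wiring up those tools is to track per-request latency and token usage in the application itself. The sketch below is a minimal local aggregator; in production the summary values could be exported as custom metrics (for example via the CloudWatch `put_metric_data` API), but the class itself is an illustration, not an AWS SDK feature.

```python
# Lightweight local tracker for per-request inference metrics.
# In production, summaries could be pushed to CloudWatch as custom metrics.
from dataclasses import dataclass, field

@dataclass
class InferenceMetrics:
    latencies_ms: list = field(default_factory=list)
    tokens: list = field(default_factory=list)

    def record(self, latency_ms: float, token_count: int) -> None:
        self.latencies_ms.append(latency_ms)
        self.tokens.append(token_count)

    def summary(self) -> dict:
        n = len(self.latencies_ms)
        return {
            "requests": n,
            "avg_latency_ms": sum(self.latencies_ms) / n if n else 0.0,
            "total_tokens": sum(self.tokens),
        }
```

Tracking tokens per request is what makes cost and energy regressions visible: a prompt change that doubles token usage shows up immediately, long before it shows up on a bill.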
Agentic AI Systems and Amazon Bedrock AgentCore
Agentic AI systems go beyond single-shot generative AI, using large language models to plan and execute specific, goal-oriented tasks
Amazon Bedrock AgentCore is a fully managed service that provides the infrastructure and capabilities for building production-grade agentic AI systems:
AgentCore Runtime: Provides per-session isolation and suspends compute while the agent waits on model or tool responses, reducing compute costs
AgentCore Gateway: Offers a unified way for agents to access external tools and APIs, with semantic search to optimize context usage
Additional components: Built-in tools, a code interpreter, identity management, and observability
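The context savings from the Gateway's semantic tool search can be seen in a simplified stand-in: rank registered tool descriptions against the user request and pass only the best match(es) to the model, rather than every tool definition. The real Gateway uses embedding-based search; the keyword-overlap scoring and tool names below are toy substitutes for illustration.

```python
# Toy stand-in for semantic tool selection: surface only the most
# relevant tool(s) to the model instead of the full catalog.
# Tool names/descriptions and Jaccard scoring are illustrative only.

TOOLS = {
    "get_weather": "fetch current weather forecast for a city",
    "create_ticket": "open a support ticket for a customer issue",
    "query_orders": "look up order status and shipping details",
}

def score(query: str, description: str) -> float:
    q, d = set(query.lower().split()), set(description.lower().split())
    return len(q & d) / len(q | d)  # Jaccard similarity as a crude proxy

def top_tools(query: str, k: int = 1) -> list[str]:
    ranked = sorted(TOOLS, key=lambda t: score(query, TOOLS[t]), reverse=True)
    return ranked[:k]
```

With hundreds of registered tools, sending only the relevant few keeps prompts short, which lowers both per-request cost and the compute (and energy) spent on every agent step.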
Key Takeaways
Generative AI's rapid growth poses significant sustainability challenges, which can be addressed through careful optimization and the use of managed services
AWS provides a range of hardware, software, and operational efficiencies to enable more sustainable and cost-effective generative AI deployments
Agentic AI systems built on Amazon Bedrock AgentCore can further improve efficiency and reduce the environmental impact of large language model-based applications