AWS re:Invent 2025 - Optimizing generative AI workloads for sustainability and cost (AIM253)

Optimizing Generative AI Workloads for Sustainability and Cost

The Growing Environmental Impact of Generative AI

  • For most of the past decade, data center electricity demand remained relatively stable at around 100 terawatt-hours per year.
  • However, the rise of generative AI since 2021 has driven a steep increase in energy consumption and resource usage.
  • Model training scale has grown roughly 350,000-fold over the last 10 years.
  • Goldman Sachs research forecasts that 60% of this added electricity demand will be met by burning fossil fuels, resulting in 215-220 million tons of carbon dioxide emissions annually.
  • The compute devoted to these workloads is doubling roughly every 100 days, compounding the environmental impact.

Sustainability Initiatives at AWS

  • In 2019, AWS took the Climate Pledge, committing to power its operations with 100% renewable energy, which was achieved in 2023.
  • AWS is 4.1 times more energy-efficient than on-premises solutions, thanks to various optimizations:
    • 100% renewable energy powering operations
    • Hardware efficiencies
    • Managed services with built-in sustainability optimizations
    • Efficient cooling and use of low-carbon concrete in data centers
    • Optimized silicon for model training and inference

Recommendations for Building Sustainable Generative AI Workloads

1. Use Managed Services

  • Managed services like Amazon Bedrock, Amazon SageMaker, and Amazon EKS shift responsibility for high utilization and sustainability optimization to AWS.
  • This lets developers focus on building their workloads without the "undifferentiated heavy lifting" of infrastructure management.
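As a sketch of how little infrastructure code a managed service leaves you with, here is a hypothetical call to a text model through Amazon Bedrock's runtime API via boto3. The model ID and request-body schema below are illustrative assumptions; the actual body format varies by model provider.

```python
import json


def build_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build a request body for a hypothetical Bedrock text model.

    The exact schema differs per model family; this shape is
    illustrative, not tied to a specific provider.
    """
    return {"prompt": prompt, "max_tokens": max_tokens}


def invoke_model(prompt: str, model_id: str = "example.text-model-v1") -> str:
    """Invoke a model via Amazon Bedrock's runtime API.

    Requires AWS credentials; the model_id here is a placeholder,
    not a real catalog entry.
    """
    import boto3  # imported lazily so the sketch can be read offline

    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        modelId=model_id,
        body=json.dumps(build_request(prompt)),
    )
    return response["body"].read().decode("utf-8")
```

No instance sizing, autoscaling, or GPU provisioning appears in the caller's code; utilization is AWS's problem, which is the sustainability argument for managed services.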

2. Select the Right Model

  • When choosing a model from a service like Amazon Bedrock, consider factors like use case, business outcome, language requirements, and domain specificity.
  • Smaller, more specialized models may be more suitable than the largest, most prominent models.
  • Amazon Bedrock's model evaluation capability can help assess and compare models on metrics like correctness, completeness, and potential harm.
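Once an evaluation produces per-model scores, the selection policy itself is simple. The sketch below encodes the "smallest model that clears the quality bar" idea in plain Python; the candidate names, metric values, and parameter counts are invented for illustration.

```python
# Hypothetical evaluation scores (0-1) per candidate, with parameter
# count (in billions) as a rough proxy for energy cost per inference.
candidates = [
    {"name": "large-general", "params_b": 70, "correctness": 0.92, "completeness": 0.90},
    {"name": "mid-general",   "params_b": 13, "correctness": 0.90, "completeness": 0.88},
    {"name": "small-domain",  "params_b": 7,  "correctness": 0.91, "completeness": 0.89},
]


def pick_model(candidates, min_correctness=0.88, min_completeness=0.85):
    """Return the smallest model that meets both quality thresholds."""
    qualified = [
        c for c in candidates
        if c["correctness"] >= min_correctness
        and c["completeness"] >= min_completeness
    ]
    return min(qualified, key=lambda c: c["params_b"]) if qualified else None


best = pick_model(candidates)
print(best["name"])  # the 7B domain-specialized model clears the bar
```

Here the 7B specialized model edges out the 70B one on correctness for this (hypothetical) domain while costing an order of magnitude less compute per request.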

3. Customize Models Efficiently

  • Start with prompt engineering, then progress to techniques like retrieval-augmented generation, parameter-efficient fine-tuning, and full fine-tuning, as needed.
  • Avoid training from scratch unless absolutely necessary, as it is the most resource-intensive and carbon-intensive approach.
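The escalation path above starts cheap: retrieval-augmented generation, the second rung, adds knowledge to a model without any fine-tuning. The keyword-overlap retriever below is a deliberately minimal stand-in for a real embedding-based vector store.

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (a toy stand-in
    for embedding similarity) and return the top k."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]


def build_rag_prompt(query: str, documents: list[str]) -> str:
    """Augment the prompt with retrieved context instead of retraining."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


docs = [
    "Graviton processors target energy-efficient general compute.",
    "Trainium is purpose-built silicon for model training.",
    "Low-carbon concrete reduces embodied emissions in data centers.",
]
prompt = build_rag_prompt("Which silicon is built for model training?", docs)
```

The energy argument: updating a document index costs almost nothing compared with a fine-tuning run, let alone training from scratch, so RAG should be exhausted before moving up the ladder.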

4. Choose Optimized Silicon

  • Use EC2 instances built on purpose-designed silicon such as the Trainium family, cited as 25-300% more energy-efficient than comparable alternatives.
  • For inference, consider CPU-based Graviton instances, which use up to 60% less energy than comparable EC2 instances.

5. Optimize Inference Deployments

  • Leverage libraries like DeepSpeed, Hugging Face Accelerate, and FasterTransformer to compress models, optimize memory usage, and distribute processing efficiently.
  • Techniques like model pruning, distillation, and quantization can further reduce model size and resource consumption.
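Of those techniques, quantization is the easiest to show end to end. The sketch below performs symmetric int8 post-training quantization of a weight vector in pure Python; real toolchains do this per-tensor or per-channel with calibration data.

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map float weights onto [-127, 127] with a shared scale.

    Roughly a 4x memory reduction versus float32, at the cost of
    bounded rounding error.
    """
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale


def dequantize(q: list[int], scale: float) -> list[float]:
    return [v * scale for v in q]


weights = [0.8, -1.27, 0.03, 0.5]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
# Rounding error is bounded by half a quantization step (scale / 2).
assert max_err <= scale / 2 + 1e-9
```

Smaller weights mean less memory traffic per token, which is where much of the inference energy goes; the accuracy cost is what the evaluation step earlier in the talk is meant to catch.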

6. Implement Continuous Monitoring and Optimization

  • Use tools like CloudWatch, SageMaker Profiler, and the NVIDIA System Management Interface (nvidia-smi) to monitor model performance, data drift, and potential biases.
  • Continuously optimize the entire generative AI lifecycle to improve sustainability and cost-effectiveness.
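CloudWatch and SageMaker Profiler cover the telemetry plumbing; the drift check itself reduces to comparing live statistics against a baseline. The standard-error threshold below is an illustrative heuristic, not a specific AWS feature.

```python
import statistics


def drift_alert(baseline: list[float], recent: list[float],
                threshold: float = 3.0) -> bool:
    """Flag drift when the recent mean shifts more than `threshold`
    standard errors away from the baseline mean."""
    mu = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    se = sd / (len(recent) ** 0.5)
    return abs(statistics.mean(recent) - mu) > threshold * se


# Hypothetical input-feature statistics: stable traffic vs. a shifted
# distribution that should trigger re-evaluation of the model.
baseline = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50, 0.53, 0.47]
stable = [0.49, 0.51, 0.50, 0.52]
shifted = [0.80, 0.82, 0.79, 0.81]
print(drift_alert(baseline, stable), drift_alert(baseline, shifted))
```

Catching drift early matters for sustainability too: it lets you retrain or re-prompt a model deliberately rather than burning compute on a model that is quietly degrading.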

Key Takeaways

  • Generative AI is contributing to a significant increase in energy consumption and carbon emissions, which must be addressed.
  • AWS provides a range of services and optimizations to help build sustainable and cost-effective generative AI workloads.
  • By leveraging managed services, selecting the right models, customizing efficiently, choosing optimized hardware, and implementing continuous monitoring, organizations can significantly reduce the environmental impact of their generative AI initiatives.
  • Responsible AI practices should be integrated throughout the entire lifecycle to ensure ethical and sustainable deployment of these powerful technologies.
