AWS re:Invent 2025 - Scaling GenAI on AWS Without Breaking the Bank through FinOps (AIM227)

Overview

  • This session discusses the key trends shaping Generative AI (GenAI), the options and cost considerations when running these workloads on AWS, and how FinOps (Financial Operations) can help turn this spend into measurable business value.
  • The presenters are JJ Sharma, TVM and FinOps practice lead for KPMG Australia, and Andrew Miji, principal product marketing manager for IBM Cloudability.

The Scale of Generative AI

  • Gartner predicts global spending on generative AI will hit $644 billion in 2025, up more than 75% from 2024.
  • However, fewer than 30% of AI leaders say their CEOs are happy with the return on this spending, which points to a need for financial discipline.

FinOps Foundation

  • The FinOps Foundation is a community of about 65,000 practicing professionals that provides tools, processes, and best practices for cloud cost optimization and management.
  • The latest FinOps survey shows a significant increase in the priority of AI and ML spend, moving up 4 places compared to the previous year.

Strategic Outlook on Generative AI

  • The "why" of GenAI is its value proposition: it can complete tasks such as content creation, code generation, and analysis much faster and more efficiently than humans.
  • The "what" is the economic environment, where GenAI prices are plummeting rapidly, following a trend similar to Moore's Law but at an even faster pace.
  • The "how" is the adoption pattern, which is expected to follow a path similar to the adoption of electricity: starting with point solutions, then moving to tool-level integration, and finally enabling system-level optimization.

FinOps Considerations for Generative AI

  • Cost drivers for GenAI go beyond just the "inference" or token usage costs, and include elements like security, design, observability, support, and maintenance.
  • These cost elements can respond differently as a solution scales, and some may increase rather than decrease.
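
The scaling point can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `CostDriver` type, the split into a fixed and a usage-based component, and every dollar figure are hypothetical and do not come from the session.

```python
from dataclasses import dataclass


@dataclass
class CostDriver:
    """One cost element of a GenAI solution (all figures are hypothetical)."""
    name: str
    fixed_monthly: float    # USD per month regardless of volume
    per_1k_requests: float  # USD per 1,000 requests served


def monthly_cost(driver: CostDriver, requests: int) -> float:
    """Monthly cost of one driver at a given request volume."""
    return driver.fixed_monthly + driver.per_1k_requests * requests / 1000


def cost_breakdown(drivers: list[CostDriver], requests: int) -> dict[str, float]:
    """Monthly cost per driver, plus the total under the key 'total'."""
    breakdown = {d.name: monthly_cost(d, requests) for d in drivers}
    breakdown["total"] = sum(breakdown.values())
    return breakdown


# Hypothetical drivers: inference is purely usage-based, while
# observability and support combine a fixed base with a usage component.
DRIVERS = [
    CostDriver("inference", fixed_monthly=0.0, per_1k_requests=2.0),
    CostDriver("observability", fixed_monthly=500.0, per_1k_requests=0.3),
    CostDriver("support", fixed_monthly=3000.0, per_1k_requests=0.5),
]

if __name__ == "__main__":
    for volume in (100_000, 10_000_000):
        costs = cost_breakdown(DRIVERS, volume)
        share = costs["inference"] / costs["total"]
        print(f"{volume:>10,} requests: total ${costs['total']:,.0f}, "
              f"inference share {share:.0%}")
```

With these made-up inputs, inference is a small share of the bill at low volume and the largest share at high volume, while support and observability keep growing in absolute terms. A real model would use the organization's own cost categories and measured scaling behavior.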

KPMG Workbench Case Study

  • KPMG member firms initially built similar GenAI solutions independently, facing common challenges around deployment, support, and enablement.
  • The KPMG Workbench was created as a globally available platform to enable many applications to leverage GenAI offerings in a provider-agnostic manner.
  • FinOps played a strategic role in engaging stakeholders, understanding the scaling timeline, and pricing the platform to ensure cost-neutrality for users.

GenAI Deployment Options on AWS

  • SaaS APIs (e.g., OpenAI, Anthropic, Cohere) provide instant access to powerful models with minimal setup, but require sharing data with third parties and have limited customization.
  • AWS Managed Services (Amazon Bedrock and Amazon SageMaker) offer more flexibility while offloading operational burden, with pricing based on usage.
  • Self-Managed Infrastructure (EC2) provides maximum flexibility but requires more expertise to manage the entire stack.
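
One way to compare these options on cost is a break-even calculation between usage-based pricing (a SaaS API or an on-demand managed service) and fixed capacity (reserved throughput or self-managed instances). The Python sketch below is illustrative only: the function names and all prices are hypothetical, and it ignores throughput limits, engineering effort, and data-handling requirements, which often decide the question in practice.

```python
HOURS_PER_MONTH = 730  # common approximation for monthly billing estimates


def usage_based_cost(tokens: int, price_per_1k_tokens: float) -> float:
    """Monthly cost when paying per token processed."""
    return tokens / 1000 * price_per_1k_tokens


def fixed_capacity_cost(hourly_rate: float, hours: float = HOURS_PER_MONTH) -> float:
    """Monthly cost of capacity billed by the hour, whether used or idle."""
    return hourly_rate * hours


def break_even_tokens(hourly_rate: float, price_per_1k_tokens: float,
                      hours: float = HOURS_PER_MONTH) -> float:
    """Monthly token volume at which both pricing models cost the same."""
    if price_per_1k_tokens <= 0:
        raise ValueError("price_per_1k_tokens must be positive")
    return fixed_capacity_cost(hourly_rate, hours) / price_per_1k_tokens * 1000


if __name__ == "__main__":
    # Hypothetical prices, not taken from any provider's price list.
    threshold = break_even_tokens(hourly_rate=20.0, price_per_1k_tokens=0.01)
    print(f"Fixed capacity pays off above {threshold:,.0f} tokens per month")
```

Below the break-even volume, usage-based pricing is cheaper. Above it, fixed capacity wins, provided that the capacity can actually serve the load.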

FinOps Practices for Generative AI

  1. Inform: Use accounts and tags to directly map costs to teams and business units. For shared resources, leverage consumption-based telemetry data (e.g., tokens) to accurately allocate costs.
  2. Optimize:
    • Choose the right foundation models for the job, right-size instances, and leverage batching to optimize usage.
    • Use AWS Savings Plans and Amazon Bedrock Provisioned Throughput to reduce rates through commitment-based discounts.
  3. Operate:
    • For experimental use cases, focus on cost per experiment or per fine-tuning run.
    • For productivity-boosting use cases, measure cost per user interaction or task assisted.
    • For new, differentiated offerings, tie the unit economics to revenue-generating outcomes or direct business value improvements.
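
The Inform and Operate steps above can be sketched in a few lines of Python: a shared bill is split in proportion to token telemetry, and each team's share is then divided by a unit of work. The function names, team names, and all numbers are hypothetical, and proportional allocation by tokens is only one possible allocation rule.

```python
def allocate_shared_cost(shared_cost: float,
                         tokens_by_team: dict[str, int]) -> dict[str, float]:
    """Split a shared bill across teams in proportion to tokens consumed."""
    total_tokens = sum(tokens_by_team.values())
    if total_tokens <= 0:
        raise ValueError("no token usage recorded; cannot allocate")
    return {team: shared_cost * tokens / total_tokens
            for team, tokens in tokens_by_team.items()}


def unit_cost(cost: float, units: int) -> float:
    """Cost per unit of work (experiment, user interaction, or outcome)."""
    if units <= 0:
        raise ValueError("units must be positive")
    return cost / units


if __name__ == "__main__":
    # Hypothetical monthly telemetry for a shared model endpoint.
    tokens = {"search": 6_000_000, "support-bot": 3_000_000, "research": 1_000_000}
    interactions = {"search": 120_000, "support-bot": 40_000}

    allocated = allocate_shared_cost(10_000.0, tokens)
    for team, cost in allocated.items():
        line = f"{team}: ${cost:,.2f}"
        if team in interactions:
            line += f" (${unit_cost(cost, interactions[team]):.3f} per interaction)"
        print(line)
```

The same `unit_cost` helper works for any of the three use-case types by changing what counts as a unit: fine-tuning runs, assisted tasks, or revenue-generating outcomes.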

Key Takeaways

  • Generative AI is experiencing rapid growth and adoption, but organizations are struggling to achieve the desired return on investment.
  • FinOps practices are crucial for gaining visibility, predictability, and efficiency in managing the costs of GenAI workloads on AWS.
  • Careful consideration of deployment options, cost drivers, and unit economics can help organizations maximize the business value of their GenAI investments.
  • The KPMG Workbench case study demonstrates the potential for strategic FinOps to enable the scalable and cost-effective deployment of GenAI solutions across an organization.
