AWS re:Invent 2025 - Manage multicloud Kubernetes at scale feat. Adobe (HMC322)

Managing Multicloud Kubernetes at Scale: Insights from AWS and Adobe

Multicloud Strategies and Considerations

  • Multicloud refers to running IT solutions or workloads across at least two cloud service providers
  • Customers adopt multicloud strategies for various reasons, such as business needs, mergers and acquisitions, or leveraging differentiated cloud capabilities
  • AWS partners with customers to help them build technical and non-technical capabilities for successful multicloud implementations

Kubernetes as a Multicloud Abstraction

  • Kubernetes has emerged as a fundamental abstraction used by multicloud customers to streamline their experience
  • Key considerations for multicloud Kubernetes fleet management include:
    • Cluster lifecycle management
    • Add-on lifecycle management (networking, security, observability)
    • Handling different cluster configurations based on workload requirements
    • Supporting multiple Kubernetes distributions

GitOps-based Multicloud Kubernetes Management

  • A GitOps-based workflow is one successful approach to multicloud Kubernetes fleet management
  • Customers store cluster and add-on configurations, as well as security, audit, and compliance policies, in Git as the source of truth
  • They then leverage GitOps operators like Argo CD and cloud-provider-specific controllers to continuously provision and reconcile the state of their multicloud Kubernetes fleet
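
The reconcile step described above can be sketched as a comparison between the desired state held in Git and the state observed in the fleet. This is a minimal illustration, not how Argo CD or any provider controller is implemented; the `Action` type, the `plan_reconcile` function, and the cluster names, providers, and versions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str      # "create", "update", or "delete"
    cluster: str   # name of the cluster the action applies to


def plan_reconcile(desired: dict[str, dict], actual: dict[str, dict]) -> list[Action]:
    """Compare desired state (from Git) with observed state and list the actions needed."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(Action("create", name))
        elif actual[name] != spec:
            actions.append(Action("update", name))
    # Anything running that Git no longer declares is scheduled for removal.
    for name in sorted(actual):
        if name not in desired:
            actions.append(Action("delete", name))
    return actions


# Hypothetical fleet: names, providers, and versions are made up for illustration.
desired = {
    "prod-a": {"provider": "aws", "version": "1.31"},
    "prod-b": {"provider": "other-cloud", "version": "1.31"},
}
actual = {
    "prod-a": {"provider": "aws", "version": "1.30"},
    "legacy": {"provider": "aws", "version": "1.29"},
}

plan = plan_reconcile(desired, actual)
for action in plan:
    print(action.kind, action.cluster)
```

Running the planner repeatedly until it returns an empty list is the essence of continuous reconciliation: the fleet converges on whatever Git declares.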

Adobe's Multicloud Kubernetes Journey

Ethos: Adobe's Platform Engineering Approach

  • Adobe's mission is to help developers write better software faster, even in a complex multicloud environment
  • Adobe built Ethos, a consistent interface and platform that simplifies the developer experience across multiple cloud providers
  • Ethos enables developers to:
    • Integrate and manage cloud-native infrastructure using uniform CI/CD pipelines
    • Focus on delivering business value rather than dealing with cloud-specific security or compliance requirements

Ethos at Scale

  • Ethos orchestrates around 4 million containers daily, with 90% of Adobe's containerized footprint running on it
  • Ethos manages 4,000 distinct services across Adobe's three clouds (Digital Experience, Creative, and Document)
  • This footprint is deployed across 450 independent Ethos clusters globally, spanning 30 regions, multiple public cloud providers, and private data centers

Declarative Infrastructure and GitOps Workflows

  • Adobe's approach is rooted in declarative infrastructure, with cluster configurations, network policies, versions, and resource definitions managed as code in Git
  • They use GitOps tools like Argo CD and Helm to automate the deployment and reconciliation of the desired state
  • This enables features like automated rollbacks, continuous pipelines, and automated tests to ensure consistency between production and the source of truth in Git

Modular and Open-Source Ethos Platform

  • Ethos is built as a modular codebase, using Helm as a package manager to deploy components and security configurations
  • This allows for rapid updates, easier maintenance, and customization for different business needs or technical challenges like multicloud
  • Adobe embraces open-source tools like cert-manager, Cluster API, and OpenTelemetry, enabling their own teams to contribute to the platform
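
One way to picture the customization that a modular, Helm-packaged platform allows is as layered configuration: a shared base with a per-cloud overlay on top. The sketch below mimics the way Helm layers values files (nested maps merge key by key, later values win, lists are replaced rather than merged). The `merge_values` function and every key and value shown are assumptions for illustration, not Ethos settings.

```python
import copy


def merge_values(base: dict, override: dict) -> dict:
    """Return a new dict where override is layered on top of base.

    Nested dicts are merged key by key; any other value (including lists)
    in override replaces the value in base.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical component settings; keys and values are illustrative only.
base_values = {
    "observability": {"enabled": True, "exporter": "otlp"},
    "certManager": {"enabled": True},
}
cloud_overlay = {
    "observability": {"exporter": "otlp-http"},
    "network": {"plugin": "provider-default"},
}

values = merge_values(base_values, cloud_overlay)
print(values)
```

Keeping the base untouched and expressing each cloud as a small overlay is what makes the same component reusable across providers.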

Automating Multicloud Kubernetes Lifecycle

  • Developers provide input on cluster configurations (name, region, cloud provider) through manifest files
  • Argo CD watches the Git repository for changes and uses cloud-provider-specific controllers (e.g., AWS Controllers for Kubernetes (ACK), Cluster API) to provision and manage the desired Kubernetes infrastructure
  • This approach enables a fully automated, GitOps-driven workflow for deploying and managing Kubernetes clusters across multiple cloud providers
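
The input side of this workflow can be sketched as validating a small manifest and choosing a controller family for it. The three fields (name, region, cloud provider) come from the talk summary; the `ClusterRequest` type, the `CONTROLLERS` mapping, the provider keys, and the example values are hypothetical and do not reflect Adobe's actual schema.

```python
from dataclasses import dataclass

# Hypothetical mapping from cloud provider to the controller family that
# would provision it; the provider keys and this mapping are assumptions.
CONTROLLERS = {
    "aws": "ACK",
    "other-cloud": "Cluster API",
}


@dataclass(frozen=True)
class ClusterRequest:
    name: str
    region: str
    provider: str


def parse_request(manifest: dict) -> ClusterRequest:
    """Validate the three inputs a developer supplies: name, region, cloud provider."""
    missing = [k for k in ("name", "region", "provider") if not manifest.get(k)]
    if missing:
        raise ValueError(f"manifest is missing required fields: {', '.join(missing)}")
    if manifest["provider"] not in CONTROLLERS:
        raise ValueError(f"unsupported provider: {manifest['provider']}")
    return ClusterRequest(manifest["name"], manifest["region"], manifest["provider"])


def select_controller(request: ClusterRequest) -> str:
    """Pick the controller family responsible for the requested provider."""
    return CONTROLLERS[request.provider]


request = parse_request({"name": "demo-1", "region": "us-east-1", "provider": "aws"})
print(request.name, "->", select_controller(request))
```

Rejecting incomplete or unsupported manifests before they are merged keeps bad input from ever reaching a provisioning controller.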

Multicloud Kubernetes Decommissioning

  • Adobe has implemented safeguards against accidental or erroneous infrastructure deletions, including:
    • Running tests and input validation
    • Checking for active resources (e.g., PodDisruptionBudgets (PDBs), ingress controllers) before deletion
  • All decommissioning actions are driven through the Git-based workflow, without the need for manual console interactions
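
A pre-deletion guard of the kind listed above might look like the following. It is a minimal sketch assuming resource counts have already been collected from the cluster; the `ClusterState` type, the `deletion_blockers` function, and the resource kinds and counts are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterState:
    name: str
    # Counts of resources still present, keyed by kind, e.g. "PodDisruptionBudget".
    active_resources: dict[str, int] = field(default_factory=dict)


def deletion_blockers(requested_name: str, cluster: ClusterState) -> list[str]:
    """Return the reasons a deletion must not proceed; an empty list means it may."""
    blockers = []
    # Input validation: the request must name the cluster it is aimed at.
    if requested_name != cluster.name:
        blockers.append(f"request names {requested_name!r} but target is {cluster.name!r}")
    # Active-resource check: anything still in use blocks the deletion.
    for kind, count in sorted(cluster.active_resources.items()):
        if count > 0:
            blockers.append(f"{count} active {kind} resource(s) remain")
    return blockers


busy = ClusterState("demo-1", {"PodDisruptionBudget": 2, "IngressController": 0})
idle = ClusterState("demo-2", {"PodDisruptionBudget": 0})

print(deletion_blockers("demo-1", busy))
print(deletion_blockers("demo-2", idle))
```

Returning the full list of reasons, rather than failing on the first one, gives the person who opened the change everything they need to fix in a single pass.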

Key Takeaways and Lessons Learned

  • Invest in building your team's Kubernetes expertise and platform engineering capabilities
  • Automate as much as possible using GitOps workflows to reduce manual errors and accelerate delivery
  • Leverage the open-source community and ecosystem to build your multicloud Kubernetes platform
  • Treat your platform as a product, gathering feedback from developers and continuously improving it
  • Embrace a "cookie-cutter" approach to enable repeatable, cloud-agnostic deployments, while still accommodating unique business requirements

Impact and Outcomes

  • Adobe's Ethos platform has enabled them to:
    • Deploy changes across their 450-cluster fleet 3 times faster
    • Perform full Kubernetes cluster upgrades 2 times faster
    • Reduce cluster provisioning time by 25%
  • The Ethos platform has been well-received by Adobe's developers, leading to happier teams and better business outcomes
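
The figures above mix two units, "N times faster" and "percent reduction", which makes them hard to compare directly. The short calculation below puts them on a common footing, assuming "N times faster" means the duration is divided by N. The function names are illustrative, and no baseline durations are implied.

```python
def time_fraction_from_speedup(speedup: float) -> float:
    """Fraction of the original duration that remains after an N-times speedup."""
    return 1.0 / speedup


def speedup_from_reduction(reduction: float) -> float:
    """Speedup factor implied by cutting duration by the given fraction."""
    return 1.0 / (1.0 - reduction)


for label, factor in [("fleet-wide change rollout", 3.0), ("full cluster upgrade", 2.0)]:
    remaining = time_fraction_from_speedup(factor)
    print(f"{label}: {factor:g}x faster -> {remaining:.0%} of the original time")

print(f"provisioning: 25% less time -> {speedup_from_reduction(0.25):.2f}x faster")
```

Under that assumption, a 3x speedup leaves about a third of the original duration, while a 25% reduction corresponds to roughly a 1.33x speedup.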
