AWS re:Invent 2025 - Building Multi-tenant global ML SaaS platform (ISV306)

Building a Multi-tenant Global ML SaaS Platform on AWS

Overview of Qlik and the Qlik Predict Platform

  • Qlik is a global leader in data analytics, with expertise in real-time data integration, business intelligence, and AI/ML
  • Qlik has over 40,000 customers across 100 countries and a strategic collaboration agreement with AWS
  • Qlik Predict is a fully managed, multi-tenant, global machine learning SaaS platform built entirely on AWS

Challenges with Traditional Forecasting Approaches

  • 70% of forecasting programs fail due to:
    • Looking at data in isolation, rather than considering multiple interacting variables
    • Inability to handle complex, computationally-intensive models like neural networks that require GPU acceleration

Key Features of Qlik Predict

  • Uses neural network architectures such as encoder-decoder models, perceptrons, and recurrent neural networks for complex, multivariate forecasting
  • Designed as a multi-tenant, globally distributed platform with the following key components:
    • GPU clusters for model training
    • Regional model training for data sovereignty
    • Scale-to-zero architecture for cost efficiency
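The scale-to-zero idea in the list above can be illustrated with a small sizing function. This is a minimal sketch, not the platform's actual autoscaler: the function name `desired_workers`, the `jobs_per_worker` ratio, and the `max_workers` cap are all hypothetical values chosen for illustration.

```python
import math


def desired_workers(queued_jobs: int, jobs_per_worker: int = 4, max_workers: int = 8) -> int:
    """Return how many GPU training workers to run for the current backlog.

    All parameters are illustrative assumptions, not values from the talk.
    """
    if queued_jobs < 0:
        raise ValueError("queued_jobs must be non-negative")
    if queued_jobs == 0:
        # Scale to zero: no pending training work means no GPU capacity is paid for.
        return 0
    # Otherwise run just enough workers for the backlog, capped to bound cost.
    return min(max_workers, math.ceil(queued_jobs / jobs_per_worker))


if __name__ == "__main__":
    for backlog in (0, 1, 9, 100):
        print(backlog, "queued ->", desired_workers(backlog), "workers")
```

The key property is the explicit zero case: an idle tenant region holds no GPU capacity, and the cap keeps a sudden burst from scaling cost without limit.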

Why AWS was the Ideal Platform

  • Global scale and data sovereignty across 38 AWS Regions (and growing)
  • GPU scaling and cost control with features like Spot Instances and scale-to-zero
  • Enterprise-grade reliability, high availability, and disaster recovery

Technical Architecture

  • API Gateway routes requests to the appropriate regional deployment
  • Amazon EKS and AWS Fargate provide scalable compute, with GPU acceleration for model training
  • Other AWS services used include Amazon MQ, S3, RDS, Lambda, and EventBridge for integration
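The regional routing step described above can be sketched as a lookup from tenant to home region to endpoint. This is an assumption-laden illustration, not the platform's routing logic: the tenant IDs, the `example.com` URLs, and the `resolve_endpoint` helper are hypothetical, and only the AWS region codes are real.

```python
# Hypothetical tenant registry: each tenant is pinned to one home region.
TENANT_HOME_REGION = {
    "tenant-eu": "eu-central-1",
    "tenant-us": "us-east-1",
}

# Hypothetical regional deployments (placeholder URLs).
REGIONAL_ENDPOINTS = {
    "eu-central-1": "https://predict.eu-central-1.example.com",
    "us-east-1": "https://predict.us-east-1.example.com",
}


def resolve_endpoint(tenant_id: str) -> str:
    """Return the regional endpoint that should serve this tenant's requests."""
    try:
        region = TENANT_HOME_REGION[tenant_id]
    except KeyError:
        # Fail closed: never fall back to a default region, because that
        # could move a tenant's data outside its required jurisdiction.
        raise LookupError(f"unknown tenant: {tenant_id}") from None
    return REGIONAL_ENDPOINTS[region]


if __name__ == "__main__":
    print(resolve_endpoint("tenant-eu"))
```

Failing on an unknown tenant, rather than defaulting to some region, is the design choice that keeps routing consistent with a data sovereignty requirement.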

Business Impact and Use Cases

  • 4,500+ active users, 167,000+ trained models, 1.5 billion predictions with 99.99% uptime
  • 9x faster model training on GPUs than on CPUs
  • Examples:
    • Appalachian Regional Healthcare reduced patient no-shows, saving $6M annually
    • Integra Financial automated lead scoring, saving $1M
    • Village Roadshow improved demand forecasting and staffing decisions
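To put the 99.99% uptime figure in concrete terms, the short calculation below converts an availability percentage into an allowed-downtime budget. The summary does not state the measurement period, so the one-year window (365 days, ignoring leap years) is an assumption, and the helper name `downtime_budget_minutes` is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability: float, period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Return the downtime allowed over a period at a given availability (0.0 to 1.0)."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0.0 and 1.0")
    return period_minutes * (1.0 - availability)


if __name__ == "__main__":
    # 99.99% availability over an assumed one-year window.
    print(round(downtime_budget_minutes(0.9999), 2))  # 52.56
```

Under that assumption, 99.99% availability leaves a budget of roughly 53 minutes of downtime per year.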

Key Takeaways

  1. Design for multi-tenancy from the ground up, not as an afterthought
  2. Invest in observability and telemetry to provide transparency to customers
  3. Integrate machine learning applications with existing workflows and systems to maximize business value

Conclusion

Qlik Predict demonstrates how a global, enterprise-ready machine learning SaaS platform can be built on AWS, using its scalability, reliability, and GPU-accelerated compute to deliver measurable business results for customers across industries.
