AWS re:Invent 2025 - AI Pioneers: Shipping Transformative GenAI Architectures to Production (SMB301)

Overview of AI Pioneers

  • AI pioneers are organizations using AI to build transformative, customer-facing architectures and use cases
  • They often work close to the AI infrastructure layer, as model builders, model customizers, or teams running large language model (LLM) inference at scale
  • Common architectures include building LLM serving platforms, creative content generation, and domain-specific vision-language models

Building LLM Serving Platforms

Challenges of Scaling LLM Inference

  • LLMs are extremely large, often requiring multiple GPUs to serve a single model
  • Unlike traditional workloads, they consume variable "thinking budgets" at runtime, making compute demand hard to predict
  • Request/response patterns are highly variable, from single-word answers to multi-page outputs
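The "multiple GPUs" point can be made concrete with a back-of-envelope sizing calculation. The sketch below is illustrative, not from the talk: it assumes fp16/bf16 weights (2 bytes per parameter), 80 GiB accelerators, and a flat ~20% overhead for KV cache and activations; real sizing also depends on batch size and context length.

```python
import math

def gpus_needed(params_billion: float, bytes_per_param: int = 2,
                gpu_memory_gib: int = 80, overhead: float = 1.2) -> int:
    """Rough GPU count needed just to hold the model weights in memory.

    Assumptions (illustrative): fp16/bf16 weights, 80 GiB accelerators,
    ~20% headroom for KV cache and activations.
    """
    weight_gib = params_billion * 1e9 * bytes_per_param / (1024 ** 3)
    return math.ceil(weight_gib * overhead / gpu_memory_gib)
```

For example, a 7B-parameter model fits on one such accelerator, a 70B model needs two, and a 405B model needs a dozen, which is why multi-GPU (and multi-node) serving is the norm at this scale.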

Key Requirements of an LLM Serving Platform

  1. Model Choice: Fast access to a variety of foundation models for building generative AI applications
  2. Supporting Services: Vector databases, security, observability, and other requirements for LLM-powered apps
  3. SaaS Capabilities: Rate limiting, cost attribution, usage reporting for LLM-as-a-service offerings
  4. Self-Managed Models: Ability to host and fine-tune custom models on accelerated infrastructure
  5. Deployment Flexibility: Option to deploy LLM inference on-premises or in the cloud
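The cost-attribution and usage-reporting requirement (item 3) can be sketched as a simple per-tenant usage meter. The prices, model name, and class shape below are illustrative assumptions, not actual Bedrock pricing or APIs.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Illustrative per-1K-token prices (input, output) in USD; real model
# pricing varies by provider and model.
PRICES = {"model-a": (0.003, 0.015)}

@dataclass
class UsageMeter:
    """Tracks token usage per (tenant, model) pair for cost attribution."""
    usage: dict = field(default_factory=lambda: defaultdict(lambda: [0, 0]))

    def record(self, tenant: str, model: str, in_tokens: int, out_tokens: int):
        entry = self.usage[(tenant, model)]
        entry[0] += in_tokens
        entry[1] += out_tokens

    def report(self, tenant: str) -> dict:
        """Per-model cost breakdown for one tenant."""
        out = {}
        for (t, model), (i, o) in self.usage.items():
            if t != tenant:
                continue
            price_in, price_out = PRICES[model]
            out[model] = round(i / 1000 * price_in + o / 1000 * price_out, 6)
        return out
```

In a real gateway this metering would sit in the request path and feed a billing or chargeback system.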

Architecture Patterns

  1. Managed Architecture: Uses Amazon Bedrock to provide model choice and supporting services
  2. SaaS Architecture: Adds an LLM gateway to provide rate limiting, cost controls, and other SaaS capabilities
  3. Hybrid Architecture: Leverages Amazon SageMaker HyperPod to host and fine-tune custom LLM models
  4. Multi-Cloud/On-Premises Architecture: Extends the hybrid architecture to enable deployment on customer-managed infrastructure using EKS Hybrid
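The rate-limiting capability of the SaaS gateway in pattern 2 is commonly implemented as a per-tenant token bucket, charging each request its prompt-plus-completion token count. This is a minimal sketch of that technique, not the talk's implementation; capacities and refill rates are arbitrary.

```python
import time

class TokenBucket:
    """Per-tenant token bucket: an LLM gateway can refill buckets at a
    tokens-per-second rate and charge each request its token cost."""

    def __init__(self, capacity: int, refill_per_sec: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill = refill_per_sec
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def allow(self, cost: int) -> bool:
        # Refill based on elapsed time, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Injecting the clock makes the limiter deterministic to test; production gateways typically keep these buckets in a shared store (e.g. Redis) so limits hold across gateway replicas.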

Creative Content Generation

Challenges of Consistent Visual Generation

  • Generic AI models cannot capture the specific visual traits, characters, and consistency required for production-level content generation
  • Need to maintain visual fidelity and character essence across multiple images/scenes

Bedrock Fine-Tuning for Customized Models

  • Uses techniques like parameter-efficient fine-tuning (PEFT), distillation, and continued pre-training (CPT) to customize the Amazon Nova Canvas model
  • Requires curated dataset, image captioning, and human-in-the-loop evaluation to ensure consistent, high-quality outputs
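The curated, captioned dataset typically takes the form of a JSONL manifest pairing image locations with captions. The sketch below assumes the "image-ref"/"caption" field names documented for Bedrock image-model customization; verify the exact format against current documentation before use.

```python
import json

def build_manifest(pairs):
    """Serialize (s3_uri, caption) pairs into a JSONL training manifest.

    Field names ("image-ref", "caption") follow the format documented for
    Bedrock image-model customization; confirm against current docs.
    """
    return "\n".join(
        json.dumps({"image-ref": s3_uri, "caption": caption})
        for s3_uri, caption in pairs
    )
```

The captions themselves can come from the automated image-captioning step, then be corrected during human-in-the-loop review before fine-tuning.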

Architecture for LLM-Based Evaluation

  • Automated video processing and character extraction to create fine-tuning dataset
  • Bedrock fine-tuning to generate customized model
  • LLM-based "judge" evaluation to assess visual consistency, prompt adherence, and other criteria at scale
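The judge step above produces per-image scores that must be rolled up into a go/no-go signal for the fine-tuned model. This is a minimal aggregation sketch; the criteria names, 1-5 scale, and pass threshold are illustrative assumptions.

```python
from statistics import mean

# Illustrative evaluation criteria; the talk mentions visual consistency
# and prompt adherence among others.
CRITERIA = ("visual_consistency", "prompt_adherence", "aesthetic_quality")

def aggregate_judgments(judgments, threshold=3.5):
    """Roll up per-image judge scores (assumed 1-5) into per-criterion
    means plus an overall pass/fail verdict."""
    scores = {c: mean(j[c] for j in judgments) for c in CRITERIA}
    scores["passed"] = all(v >= threshold for v in scores.values())
    return scores
```

Running this over a held-out prompt set after each fine-tuning iteration gives a cheap, scalable regression check before any human review.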

Arabic Vision-Language Model for Document Processing

  • Misraji AI, a pioneering AI lab in Saudi Arabia, developed an Arabic-specific vision-language model for OCR and document-processing use cases
  • Leveraged a hybrid approach of real-world and synthetic data, along with iterative fine-tuning strategies, to create a state-of-the-art model
  • Enabled highly accurate Arabic OCR, competing with top models in the market
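Claims like "highly accurate Arabic OCR" are usually quantified with character error rate (CER), a standard OCR metric. Below is a plain Levenshtein-based CER, included as general background rather than Misraji AI's actual evaluation pipeline.

```python
def char_error_rate(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein edit distance between the
    ground-truth text and the OCR output, divided by reference length.
    Lower is better; 0.0 means a perfect transcription."""
    m, n = len(reference), len(hypothesis)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            cur[j] = min(prev[j] + 1,        # deletion
                         cur[j - 1] + 1,     # insertion
                         prev[j - 1] + cost) # substitution
        prev = cur
    return prev[n] / m if m else float(n > 0)
```

The metric works unchanged on Arabic strings, since it compares Unicode characters rather than bytes.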

Emerging Architecture: Intelligent Control and Operations Plane (ICOP)

  • Provides a specialized, provider-hosted API endpoint for deploying and managing AI workloads like LLM serving
  • Understands the workload requirements, plans the optimal deployment, handles the provisioning, and monitors the infrastructure
  • Leverages customized, task-specific language models rather than general-purpose assistants to enable fast, cost-effective, and reliable AI workload management
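The "understand requirements, plan deployment" loop can be illustrated with a toy planner that maps a declarative workload spec to a deployment shape. Everything here is hypothetical: the spec fields, instance labels, size thresholds, and monitored metrics are assumptions for illustration, not an actual ICOP API.

```python
def plan_deployment(spec: dict) -> dict:
    """Hypothetical ICOP-style planner: map a declarative LLM-serving
    workload spec to a deployment plan. All names are illustrative."""
    size = spec["model_size_b"]           # model size in billions of params
    target = spec.get("target", "cloud")  # e.g. "cloud" or "on_prem"
    if size <= 8:
        shape = {"accelerators": 1, "instance": "single-gpu"}
    elif size <= 70:
        shape = {"accelerators": 4, "instance": "multi-gpu"}
    else:
        shape = {"accelerators": 8, "instance": "multi-node"}
    return {"workload": spec["name"], "target": target, **shape,
            "monitoring": ["latency_p99", "tokens_per_sec", "gpu_util"]}
```

In the architecture described, a task-specific language model would produce or refine such a plan from the workload description, and the control plane would then provision and monitor the result.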

Key Takeaways

  • AI pioneers are pushing the boundaries of generative AI, building transformative customer-facing applications
  • Scaling LLM inference requires specialized platforms that address model choice, SaaS capabilities, self-managed models, and deployment flexibility
  • Customizing AI models, like image generation, is crucial for maintaining visual consistency and brand identity
  • Domain-specific vision-language models can unlock new capabilities, like state-of-the-art Arabic OCR
  • Emerging "Intelligent Control and Operations Plane" architectures aim to simplify the deployment and management of AI workloads
