AWS re:Invent 2025 - Building Fast, Cost-Efficient, Sovereign Inference Platforms on AWS with Intel CPUs
Intel and AWS Partnership
The Intel and AWS partnership dates back to the launch of Amazon EC2 in 2006, spanning over 18 years.
The collaboration runs deep, with Intel providing the hardware infrastructure and software optimization to power AWS services.
AWS offers more than 400 EC2 instance types running on Intel processors, providing a wide breadth of compute and technology options.
Beyond just hardware, Intel also focuses on software optimization and enabling the ecosystem to run AI workloads efficiently on Intel platforms.
Optimizing AI Inference on Intel CPUs
While AI has been heavily focused on training large models, the next phase will be about enterprise-scale inference workloads.
Intel believes inference workloads will grow significantly, and that they can run efficiently on Intel Xeon CPUs, not just GPUs.
The new Intel Xeon 6 processor, custom-built for AWS for its eighth generation of EC2 instances, delivers up to 20% better performance than the previous generation across diverse workloads.
The processor includes Intel Advanced Matrix Extensions (AMX), which accelerate matrix multiplication to speed up both inference and training.
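AMX targets the matrix multiplications that dominate transformer inference, which is why a dedicated matmul engine matters for CPU inference. As a rough back-of-envelope illustration (the layer shape below is hypothetical and not from the talk), almost all of a decoder layer's arithmetic is matmul multiply-accumulates:

```python
def matmul_macs(m: int, k: int, n: int) -> int:
    """Multiply-accumulate operations for an (m x k) @ (k x n) matmul."""
    return m * k * n

def transformer_layer_macs(seq_len: int, d_model: int, d_ff: int) -> int:
    """Approximate MACs for one decoder layer (attention projections + FFN),
    ignoring attention-score matmuls and norms for simplicity."""
    qkv_and_out = 4 * matmul_macs(seq_len, d_model, d_model)  # Q, K, V, output projections
    ffn = 2 * matmul_macs(seq_len, d_model, d_ff)             # FFN up- and down-projection
    return qkv_and_out + ffn

# Illustrative 7B-class layer shape (hypothetical values)
macs = transformer_layer_macs(seq_len=1024, d_model=4096, d_ff=11008)
print(f"~{macs / 1e9:.1f} GMACs per layer per forward pass")
```

At this scale a single layer already runs into the hundreds of billions of multiply-accumulates per forward pass, so hardware that multiplies matrix tiles natively, as AMX does, moves the needle far more than general-purpose vector units.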
Flexible and Cost-Effective Inference on EC2
AWS has launched the new EC2 8i instances powered by the custom Intel Xeon 6 processor across the C, R, and M instance families.
These instances offer flexible configurations through the "Flex" variant, allowing customers to optimize network and storage performance based on their workload needs.
These Flex instances provide a lower-cost alternative to the standard instances, improving price-performance for workloads that do not need sustained peak performance.
Sovereign Inference Platforms with Intel and AWS
Customers like Deloitte are leveraging Intel-powered EC2 instances to build secure, cost-efficient, and scalable inference platforms.
Deloitte's approach runs large language models (LLMs) and small language models (SLMs) on CPU-based EC2 instances, cutting infrastructure costs by up to 56% compared to GPU-based instances.
By using compressed models together with Intel's software optimizations, Deloitte achieved near-parity in accuracy with uncompressed models running on GPUs.
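Model compression typically starts with quantizing weights to lower precision, such as int8. The talk does not detail the actual compression pipeline, so the following is only a minimal sketch of symmetric int8 quantization to show why accuracy loss can stay small: the per-tensor rounding error is bounded by half the quantization step.

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric int8 quantization: map floats onto [-127, 127] with one scale."""
    peak = max(abs(w) for w in weights)
    scale = peak / 127 if peak else 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from int8 values and the scale."""
    return [v * scale for v in q]

weights = [0.82, -1.37, 0.05, 2.54, -0.91]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
# Rounding error is at most half a quantization step (scale / 2)
print(q, round(max_err, 4))
```

Shrinking weights 4x versus float32 also cuts memory bandwidth, which is usually the bottleneck for CPU inference; production pipelines add calibration and per-channel scales on top of this basic idea.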
This enables secure, private inference inside the customer's own VPC, addressing data sovereignty and security concerns.
AI Innovation with Articul8
Articul8 is a platform that helps enterprises convert complex data into personalized insights and outcomes.
Articul8's platform combines traditional machine learning models, small language models, and large language models, orchestrated by an intelligent model-routing system.
The platform ships with a library of domain-specific and task-specific models, tuned to outperform general-purpose models on specialized use cases.
Articul8's models are also optimized for cost-efficient inference on Intel Xeon processors, delivering both performance and cost benefits.
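The intelligent model routing described above can be thought of as a dispatcher that sends each request to the cheapest model tier expected to handle it well. The toy sketch below illustrates the idea only; the model names, prices, and keyword heuristic are invented (real routers typically use learned classifiers):

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float  # hypothetical pricing, for illustration

# Hypothetical tiers: traditional ML -> SLM -> LLM
CLASSIFIER = Model("domain-classifier", 0.01)
SLM = Model("domain-slm-3b", 0.05)
LLM = Model("general-llm-70b", 0.60)

def route(query: str) -> Model:
    """Send each query to the cheapest tier expected to handle it.
    This keyword heuristic is a stand-in for a learned routing model."""
    words = query.lower().split()
    if any(w in {"classify", "label", "categorize"} for w in words):
        return CLASSIFIER          # pure classification: traditional ML suffices
    if len(words) < 20 and "explain" not in words:
        return SLM                 # short, routine request: small model
    return LLM                     # open-ended reasoning: fall back to the LLM

print(route("classify this support ticket").name)    # -> domain-classifier
print(route("summarize this invoice").name)          # -> domain-slm-3b
print(route("explain the tradeoffs in detail").name) # -> general-llm-70b
```

The cost advantage comes from keeping the bulk of traffic on the cheap tiers, which is also where CPU-based inference on Xeon instances is most competitive.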
Key Takeaways
Intel and AWS have a long-standing partnership, collaborating to develop custom hardware and software solutions to power enterprise workloads.
The new Intel Xeon 6 processor, custom-built for AWS, offers significant performance improvements for inference and general compute workloads.
Customers can leverage Intel-powered EC2 instances to build secure, cost-efficient, and scalable inference platforms, addressing data sovereignty and security concerns.
Specialized AI platforms like Articul8 leverage Intel's hardware and software optimizations to deliver domain-specific and task-specific models that outperform general-purpose models.
The combination of Intel's hardware, software, and ecosystem enables enterprises to unlock the full potential of AI inference at scale.