TalksAWS re:Invent 2025 - [NEW LAUNCH] Amazon Nova 2 Omni: A new frontier in multimodal AI (AIM3324)

AWS re:Invent 2025 - [NEW LAUNCH] Amazon Nova 2 Omni: A new frontier in multimodal AI (AIM3324)

AWS re:Invent 2025 - Amazon Nova 2 Omni: A New Frontier in Multimodal AI

Overview of the Amazon Nova Family of Models

The Amazon Nova family of foundation models was launched at re:Invent 2024, including:
- Nova Understanding models for text, image, and video understanding
- Nova Canvas for image generation
- Nova Real for video generation
- Nova Sonic for real-time conversational AI

Introduction to the Amazon Nova 2 Family

Four new models in the Amazon Nova 2 family were introduced:
- Nova 2 Light: A fast, cost-effective reasoning model for everyday workloads
- Nova 2 Pro: A higher-performance reasoning model for complex tasks
- Nova 2 Omni: A unified multimodal reasoning and image generation model
- Nova 2 Sonic: An improved version of the conversational AI model

Key Capabilities of the Amazon Nova 2 Omni Model

Multimodal understanding and generation: Can process and generate content across text, images, video, and audio
Hybrid reasoning: Developers can control the level of reasoning the model applies
Powerful multimodal perception: State-of-the-art performance on tasks like document understanding, audio understanding, and video understanding
High-quality image generation: Improved text rendering and spatial understanding compared to previous models
Broad language support: Understands over 200 languages, including up to 10 languages for audio/speech

Technical Performance of the Amazon Nova 2 Omni Model

Highly competitive on benchmarks measuring language understanding, reasoning, instruction following, and tool calling
Outperforms other leading models on the Artificial Analysis Index, a consolidated metric across 10+ benchmarks
Achieves state-of-the-art results on document understanding tasks like OCR and key information extraction
Ranked #2 on the MMAU leaderboard for audio understanding and reasoning

Business Applications and Use Cases

Document Understanding

Accurately extracts text, images, and structured information from complex documents
Can identify inconsistencies and perform calculations within the documents

Audio Understanding

Transcribes speech, summarizes audio content, and answers questions about audio files
Supports multi-speaker diarization and multiple languages

Image and Video Understanding

Excels at perception tasks like object detection, scene understanding, and temporal reasoning
Outperforms other models on benchmarks like the Video Benchmark and the new Mavericks benchmark

Image Generation and Editing

Generates high-quality, realistic images from text prompts
Supports a wide range of image editing operations like adding, altering, and replacing objects

Customer Spotlight: Densu Digital's Use Cases

Densu Digital, a leading advertising agency, is using the Amazon Nova 2 Omni model in several ways:
- Ad creative generation, performance prediction, and improvement suggestion
- Automating marketing workflows and agent-based applications
- Connecting in-store and digital experiences through persona-based interactions

Key Takeaways

The Amazon Nova 2 Omni model represents a significant advancement in multimodal AI, with state-of-the-art performance across a wide range of perception, reasoning, and generation tasks.
The model's ability to understand and generate content across text, images, video, and audio enables new classes of applications and workflows that were previously difficult to achieve.
Customers like Densu Digital are already leveraging the power of the Nova 2 Omni model to streamline creative processes, automate marketing operations, and create more immersive customer experiences.
The technical performance and real-world business impact demonstrate the transformative potential of this new frontier in multimodal AI.

Your Digital Journey deserves a great story.

Build one with us.

These cookies are used to collect information about how you interact with this website and allow us to remember you. We use this information to improve and customize your browsing experience, as well as for analytics.

If you decline, your information won’t be tracked when you visit this website. A single cookie will be used in your browser to remember your preference.

AWS re:Invent 2025 - [NEW LAUNCH] Amazon Nova 2 Omni: A new frontier in multimodal AI (AIM3324)

AWS re:Invent 2025 - Amazon Nova 2 Omni: A New Frontier in Multimodal AI

Overview of the Amazon Nova Family of Models

Introduction to the Amazon Nova 2 Family

Key Capabilities of the Amazon Nova 2 Omni Model

Technical Performance of the Amazon Nova 2 Omni Model

Business Applications and Use Cases

Document Understanding

Audio Understanding

Image and Video Understanding

Image Generation and Editing

Customer Spotlight: Densu Digital's Use Cases

Key Takeaways

Your Digital Journey deserves a great story.

Build one with us.

Headquarters

Delivery Centre

AWS re:Invent 2025 - [NEW LAUNCH] Amazon Nova 2 Omni: A new frontier in multimodal AI (AIM3324)

AWS re:Invent 2025 - Amazon Nova 2 Omni: A New Frontier in Multimodal AI

Overview of the Amazon Nova Family of Models

Introduction to the Amazon Nova 2 Family

Key Capabilities of the Amazon Nova 2 Omni Model

Technical Performance of the Amazon Nova 2 Omni Model

Business Applications and Use Cases

Document Understanding

Audio Understanding

Image and Video Understanding

Image Generation and Editing

Customer Spotlight: Densu Digital's Use Cases

Key Takeaways

Your Digital Journey deserves a great story.

Build one with us.

This website stores cookies on your computer.