AWS re:Invent 2025 - Video sampling & search using ElastiCache & multimodal embeddings (DAT433)

Video Summary: AWS re:Invent 2025 - Video Sampling & Search using ElastiCache & Multimodal Embeddings

Overview

This presentation showcases a video sampling and search application built using AWS services, including Amazon ElastiCache for Valkey, Amazon Bedrock, and various AI/ML tools. The application ingests videos, extracts key frames, generates multimodal embeddings, and stores them in ElastiCache for efficient vector similarity search.

Vector Similarity Search with Valkey

  • Valkey supports vector similarity search on two data types: hashes and JSON documents
  • Users define an index and schema; writes to indexed keys are immediately reflected in the main database
  • Indexing is performed asynchronously by dedicated worker threads, so the main thread can continue serving other requests
  • Queries are handled immediately by the query engine, with results optionally enriched with data from the main database
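
The nearest-neighbor semantics described above can be illustrated with a minimal pure-Python sketch. The toy 3-dimensional vectors and frame keys here are invented for illustration; in the real application the search runs server-side in Valkey rather than in client code:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def knn(query, index, k=2):
    # Return the k keys of `index` ({key: vector}) whose vectors are
    # closest to `query` by cosine similarity.
    ranked = sorted(index.items(),
                    key=lambda kv: cosine_similarity(query, kv[1]),
                    reverse=True)
    return [key for key, _ in ranked[:k]]

# Toy "embeddings" standing in for real model output.
frames = {
    "frame:001": [1.0, 0.0, 0.0],
    "frame:002": [0.9, 0.1, 0.0],
    "frame:003": [0.0, 1.0, 0.0],
}
print(knn([1.0, 0.05, 0.0], frames, k=2))  # the two frames nearest the query
```

A real deployment would express the same operation as a KNN query against the Valkey index, so the ranking work scales with replicas instead of client CPU.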

Scaling Valkey

  • Ingestion scales by adding more shards, increasing the ingestion rate
  • Search scales by adding more replicas, increasing the query throughput
  • Scaling up the instance type also improves search performance by utilizing more CPU cores

Application Architecture

  1. Ingestion Pipeline:

    • Videos are uploaded to S3
    • Frames are extracted and analyzed using AI/ML services (e.g., Amazon Rekognition, transcription, summarization)
    • Duplicate frames are identified and removed using vector similarity search in Valkey
    • Remaining frames are processed, and their multimodal embeddings are stored in Valkey
  2. Search Functionality:

    • Users can search by text or image
    • The query is transformed into an embedding using the same foundation models as the ingestion pipeline
    • Valkey performs a nearest neighbor search, and the matching frames are retrieved from S3 and displayed
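
The deduplication step in the ingestion pipeline can be sketched in pure Python: a frame is dropped when its embedding is within a similarity threshold of a frame already kept. The threshold value and the toy 2-dimensional embeddings are invented for illustration; the talk's pipeline performs this check with a vector similarity query against Valkey:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def dedup_frames(frames, threshold=0.98):
    # frames: list of (frame_id, embedding), in extraction order.
    # A frame is a duplicate if its embedding is at least `threshold`
    # cosine-similar to any frame already kept.
    kept = []
    for frame_id, emb in frames:
        if all(cosine_similarity(emb, kept_emb) < threshold
               for _, kept_emb in kept):
            kept.append((frame_id, emb))
    return [frame_id for frame_id, _ in kept]

frames = [
    ("f1", [1.0, 0.0]),
    ("f2", [0.999, 0.001]),  # near-duplicate of f1, should be dropped
    ("f3", [0.0, 1.0]),      # visually distinct, should be kept
]
print(dedup_frames(frames))  # ['f1', 'f3']
```

Offloading this check to Valkey means each new frame costs one KNN query instead of a client-side scan over every frame kept so far.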

Technical Details

  • Amazon Bedrock is used to generate text and multimodal embeddings
  • Valkey is integrated using the Glide client library, which supports both the hash and JSON data types
  • Valkey indexes are created with two vector fields: one for text embeddings and one for multimodal embeddings
  • Searching is implemented using Valkey's FT search API, allowing both text-based and image-based queries via their embeddings
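
To make the storage and query details more concrete, here is a hedged sketch: vector fields on hashes are typically stored as little-endian float32 byte strings, and a KNN lookup is expressed through an FT.SEARCH command. The index and field names below are invented, and the exact query syntax is an assumption based on the common FT.SEARCH KNN form; check the Valkey search documentation for your server version before relying on it:

```python
import struct

def pack_embedding(vec):
    # Vector hash fields are commonly stored as little-endian float32 bytes.
    return struct.pack(f"<{len(vec)}f", *vec)

def unpack_embedding(blob):
    # Inverse of pack_embedding, for reading a stored vector back.
    n = len(blob) // 4
    return list(struct.unpack(f"<{n}f", blob))

def knn_search_args(index, field, query_vec, k=5):
    # Argument list for a KNN FT.SEARCH; with a client like Glide this
    # could be sent as a custom/raw command. Query syntax is an assumed
    # form -- verify against your server's search module docs.
    return [
        "FT.SEARCH", index,
        f"*=>[KNN {k} @{field} $vec]",
        "PARAMS", "2", "vec", pack_embedding(query_vec),
        "DIALECT", "2",
    ]

emb = [0.1, 0.2, 0.3]
blob = pack_embedding(emb)          # 3 floats -> 12 bytes
roundtrip = unpack_embedding(blob)
print([round(x, 4) for x in roundtrip])  # [0.1, 0.2, 0.3]
```

The byte-packing round-trip is the part most worth getting right: a vector stored with the wrong width or endianness will index without error but return nonsense neighbors.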

Business Impact

  • The application demonstrates how vector similarity search can be used to build powerful multimedia search and analysis tools
  • By leveraging AWS services like ElastiCache, Bedrock, and various AI/ML offerings, the solution can be easily scaled and integrated into a wide range of media-centric applications
  • The ability to efficiently search and retrieve relevant video content can have significant impact in areas such as content discovery, video analytics, and personalized recommendations

Examples and Use Cases

  • The presented application is built on top of the "Guidance for Media Extraction and Dynamic Content Policy Framework" solution available in the AWS Solutions Library
  • The solution can be applied to various media-related use cases, such as:
    • Video content search and discovery
    • Automated video analysis and tagging
    • Personalized video recommendations
    • Media asset management and archiving
