AWS re:Invent 2025 - Accelerate gen AI and ML workloads with AWS storage (STG201)

Accelerating Gen AI and ML Workloads with AWS Storage

Improving Productivity with Prompt Engineering

  • Prompt engineering provides examples, context, and constraints in the prompt to guide large language model (LLM) responses
  • Can significantly improve productivity by automating tasks like drafting PR/FAQ documents
  • Challenges include scaling prompt engineering to handle large volumes of data and multiple data sources; a minimal prompt-assembly sketch follows this list
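As a rough illustration (not taken from the talk), a prompt for the PR/FAQ use case might be assembled from context, a few worked examples, and constraints before being sent to a model. The document template and example pair below are hypothetical placeholders.

```python
# A rough sketch of prompt assembly: context, few-shot examples, and
# constraints are combined into one prompt. The PR/FAQ template and the
# example pair are hypothetical placeholders.

EXAMPLES = [
    (
        "Feature: one-click data export for the analytics dashboard",
        "PR/FAQ draft: Today we announced one-click data export...",
    ),
]

def build_prompt(task: str, context: str, examples: list[tuple[str, str]]) -> str:
    parts = [
        "You are drafting an internal PR/FAQ document.",
        f"Context:\n{context}",
        "Constraints: keep the press release under 300 words and include 5 FAQs.",
    ]
    for example_input, example_output in examples:
        parts.append(f"Example input:\n{example_input}\nExample output:\n{example_output}")
    parts.append(f"Task:\n{task}")
    return "\n\n".join(parts)

prompt = build_prompt(
    task="Draft a PR/FAQ for a new analytics dashboard.",
    context="Launch planned for Q3; target users are operations teams.",
    examples=EXAMPLES,
)
print(prompt)
```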

Leveraging Retrieval Augmented Generation (RAG)

  • RAG uses semantic search to find and return relevant data from a data lake to augment the original prompt
  • Converts data into vector embeddings to enable efficient semantic search
  • Allows LLMs to access relevant data without manually loading everything into the prompt; a minimal retrieval sketch follows this list
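A minimal retrieval sketch, assuming an in-memory document set and a placeholder embed() function standing in for a real embedding model (for example, one hosted on Amazon Bedrock):

```python
# Minimal RAG sketch: embed the query, rank documents by cosine similarity,
# and prepend the top matches to the prompt. embed() is a crude placeholder
# for a real embedding model.
import numpy as np

def embed(text: str) -> np.ndarray:
    vec = np.zeros(64)
    for i, byte in enumerate(text.encode()):
        vec[i % 64] += byte
    return vec / (np.linalg.norm(vec) + 1e-9)

DOCS = [
    "Quarterly revenue grew 12% on storage services.",
    "Checkpoint intervals were reduced to five minutes.",
    "The new dashboard launches in Q3 for operations teams.",
]
DOC_VECS = np.stack([embed(d) for d in DOCS])

def retrieve(query: str, k: int = 2) -> list[str]:
    scores = DOC_VECS @ embed(query)  # dot product of unit vectors = cosine similarity
    return [DOCS[i] for i in np.argsort(scores)[::-1][:k]]

question = "When does the dashboard launch?"
context = "\n".join(retrieve(question))
augmented_prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(augmented_prompt)
```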

Optimizing RAG with Metadata Filtering

  • Metadata provides context, lineage, and classification for data
  • Enables more targeted and effective RAG searches by narrowing the search space
  • Metadata becomes the "nervous system" for the entire AI operation; a minimal filtering sketch follows this list
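A minimal sketch of metadata filtering during retrieval, assuming each stored vector carries a metadata dictionary; the record fields and filter keys are hypothetical:

```python
# Candidates are first narrowed by metadata (source, classification), then
# ranked by vector similarity over the reduced search space.
from dataclasses import dataclass
import numpy as np

@dataclass
class Record:
    text: str
    vector: np.ndarray
    metadata: dict

def filtered_search(records, query_vec, *, source=None, classification=None, k=3):
    candidates = [
        r for r in records
        if (source is None or r.metadata.get("source") == source)
        and (classification is None or r.metadata.get("classification") == classification)
    ]
    candidates.sort(key=lambda r: float(r.vector @ query_vec), reverse=True)
    return candidates[:k]

rng = np.random.default_rng(0)
records = [
    Record("Phase 2 trial results for compound A.", rng.normal(size=8),
           {"source": "papers", "classification": "public"}),
    Record("Internal cost model for storage tiers.", rng.normal(size=8),
           {"source": "finance", "classification": "confidential"}),
]
hits = filtered_search(records, rng.normal(size=8), source="papers")
print([r.text for r in hits])
```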

Introducing Agents for Complex Tasks

  • Agents combine LLMs with access to tools like databases, knowledge bases, and APIs
  • Allows LLMs to reason through multi-step workflows and leverage diverse data sources
  • Requires managing tool integrations, which the Model Context Protocol (MCP) standardizes; a minimal agent-loop sketch follows this list
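A minimal agent-loop sketch, with call_llm() standing in for a real model invocation and a local tool registry standing in for tools that MCP servers would expose in a real deployment:

```python
# The model either answers or requests a registered tool; the tool result is
# appended to the conversation and the loop continues until an answer appears.
import json

TOOLS = {
    "lookup_order": lambda order_id: {"order_id": order_id, "status": "shipped"},
}

def call_llm(messages: list[dict]) -> dict:
    # Placeholder policy: request the tool once, then answer from its result.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "lookup_order", "arguments": {"order_id": "123"}}
    return {"answer": "Order 123 has shipped."}

def run_agent(question: str) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(5):  # bound the number of reasoning steps
        step = call_llm(messages)
        if "answer" in step:
            return step["answer"]
        result = TOOLS[step["tool"]](**step["arguments"])
        messages.append({"role": "tool", "content": json.dumps(result)})
    return "Stopped after too many steps."

print(run_agent("Where is order 123?"))
```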

Updating Models with Labeled Data

  • Supervised fine-tuning: Provides labeled examples of desired inputs and outputs to guide model updates
  • Distillation: Uses a larger "teacher" model to generate outputs that are then used to train a smaller "student" model
  • Alignment: Provides examples of preferred and non-preferred responses to shape model behavior (example data formats are sketched after this list)
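An illustrative sketch of the labeled data behind these techniques: prompt/completion pairs for supervised fine-tuning (or distillation, where a teacher model generates the completions) and chosen/rejected pairs for preference-based alignment. The JSONL field names are assumptions and vary by training framework.

```python
# Write hypothetical fine-tuning and alignment datasets as JSONL.
import json

sft_examples = [
    {"prompt": "Summarize: storage costs fell 8% this quarter.",
     "completion": "Storage costs decreased 8% quarter over quarter."},
]

alignment_pairs = [
    {"prompt": "Explain Amazon S3 to a new engineer.",
     "chosen": "Amazon S3 is an object storage service accessed through an API...",
     "rejected": "It's just a big hard drive in the cloud."},
]

with open("sft.jsonl", "w") as f:
    for example in sft_examples:
        f.write(json.dumps(example) + "\n")

with open("preferences.jsonl", "w") as f:
    for pair in alignment_pairs:
        f.write(json.dumps(pair) + "\n")
```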

Continued Pre-training and Training from Scratch

  • Continued pre-training embeds new knowledge, tone, and relationships into an existing model
  • Training from scratch is required when the existing model structure is fundamentally different from the desired application
  • Efficient data loading and checkpointing to storage are critical for the resource-intensive training process; a minimal checkpointing sketch follows this list
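A minimal checkpointing sketch using PyTorch, assuming checkpoints are written to a high-throughput file system mount such as FSx for Lustre; the path, model, and interval are hypothetical:

```python
# Model and optimizer state are periodically saved so training can resume
# after interruption without losing progress.
import torch
import torch.nn as nn

CHECKPOINT_DIR = "/mnt/fsx/checkpoints"  # hypothetical mount point
model = nn.Linear(1024, 1024)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def save_checkpoint(step: int) -> None:
    torch.save(
        {
            "step": step,
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
        },
        f"{CHECKPOINT_DIR}/step_{step:08d}.pt",
    )

for step in range(1, 1001):
    # ... forward pass, backward pass, optimizer.step() would go here ...
    if step % 250 == 0:
        save_checkpoint(step)
```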

AWS Storage Solutions for ML Workloads

  • Amazon FSx for Lustre and Amazon FSx for OpenZFS provide high-performance, scalable file storage for research and development
  • Amazon S3 Express One Zone offers low-latency, high-throughput object storage optimized for ML training
  • Mountpoint for Amazon S3 and the Amazon S3 Connector for PyTorch provide file-like interfaces for efficient data access; a data-loading sketch follows this list
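A sketch of streaming training data directly from S3 with the Amazon S3 Connector for PyTorch (the s3torchconnector package); the bucket, prefix, region, and transform are assumptions, and the exact API should be checked against the connector's documentation for your version:

```python
# Build a PyTorch dataset over an S3 prefix and feed it to a DataLoader.
from s3torchconnector import S3MapDataset
from torch.utils.data import DataLoader

def to_sample(obj):
    # Each object exposes its key and a readable body.
    return obj.key, obj.read()

dataset = S3MapDataset.from_prefix(
    "s3://my-training-bucket/tokenized/",  # hypothetical bucket and prefix
    region="us-east-1",
    transform=to_sample,
)
loader = DataLoader(dataset, batch_size=32, num_workers=8)

for keys, payloads in loader:
    pass  # feed payloads into the training step
```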

Business Impact and Examples

  • Biotech firm used S3 Vectors for semantic search on 30 million scientific papers, dramatically reducing research timelines
  • Meta worked with AWS to scale S3 Express One Zone to 140 Tbps of throughput for large-scale model training
  • Customers can leverage AWS storage services to build scalable, cost-effective AI pipelines for diverse use cases
