AWS re:Invent 2025 - Accelerate gen AI and ML workloads with AWS storage (STG201)
Improving Productivity with Prompt Engineering
Prompt engineering supplies examples, context, and constraints to guide large language model (LLM) responses
Can significantly improve productivity by automating tasks like creating PR FAQ documents
Challenges include scaling prompt engineering to handle large volumes of data and multiple data sources
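A minimal sketch of these ideas, assuming the Amazon Bedrock Converse API; the model ID, example, and constraints are illustrative placeholders, not from the talk:

```python
# Minimal sketch: a few-shot prompt that packs an example, context, and
# constraints into one request via the Amazon Bedrock Converse API.
# The model ID and all prompt text are illustrative placeholders.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

system_prompt = (
    "You draft PR FAQ documents. Follow the structure of the example, "
    "stay under 500 words, and do not mention unannounced features."  # constraints
)
example = (  # a worked example guides the model's output format
    "Example PR FAQ:\n"
    "Headline: <one sentence>\nProblem: <paragraph>\n"
    "Solution: <paragraph>\nFAQ: <Q&A pairs>\n"
)
task = "Draft a PR FAQ for an internal cost-reporting dashboard."

response = bedrock.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # placeholder model
    system=[{"text": system_prompt}],
    messages=[{"role": "user", "content": [{"text": example + "\n" + task}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```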
Leveraging Retrieval Augmented Generation (RAG)
RAG uses semantic search to find and return relevant data from a data lake to augment the original prompt
Converts data into vectors to enable efficient semantic search
Allows LLMs to access relevant data without manually loading everything into the prompt
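A minimal RAG sketch, assuming a Bedrock Titan embedding model; the tiny in-memory index is a stand-in for a real vector store such as S3 Vectors or OpenSearch:

```python
# RAG sketch: convert chunks and the query into vectors, find the most
# semantically similar chunk, and augment the prompt with it. The
# in-memory list stands in for a real vector store over a data lake.
import json
import math
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def embed(text: str) -> list[float]:
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",  # placeholder embedding model
        body=json.dumps({"inputText": text}),
    )
    return json.loads(resp["body"].read())["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Index a few chunks; a real system would index the whole data lake.
chunks = [
    "FSx for Lustre serves shared, high-throughput training data.",
    "S3 Express One Zone minimizes first-byte latency for small objects.",
    "Checkpoints should be written without stalling the accelerators.",
]
index = [(embed(c), c) for c in chunks]

# Semantic search: retrieve the nearest chunk and augment the prompt
# with it instead of manually loading everything.
question = "Which storage minimizes latency for training data?"
q_vec = embed(question)
best = max(index, key=lambda item: cosine(q_vec, item[0]))[1]
prompt = f"Context:\n{best}\n\nQuestion: {question}"
```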
Optimizing RAG with Metadata Filtering
Metadata provides context, lineage, and classification for data
Enables more targeted and effective RAG searches by narrowing the search space
Metadata becomes the "nervous system" for the entire AI operation
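A sketch of metadata filtering layered onto the search above; it reuses the embed() and cosine() helpers from the RAG sketch, and the metadata fields are illustrative:

```python
# Each indexed chunk carries metadata (context, lineage, classification).
# Filtering on metadata first shrinks the candidate set, so the vector
# comparison runs over fewer, more relevant items. Reuses embed() and
# cosine() from the previous sketch; field names are illustrative.
index = [
    (embed("Protein folding benchmark results ..."), "chunk-1",
     {"domain": "biology", "year": 2024, "classification": "public"}),
    (embed("Quarterly revenue summary ..."), "chunk-2",
     {"domain": "finance", "year": 2024, "classification": "internal"}),
]

def filtered_search(question: str, required: dict) -> str:
    q_vec = embed(question)
    candidates = [(vec, text) for vec, text, meta in index
                  if all(meta.get(k) == v for k, v in required.items())]
    return max(candidates, key=lambda c: cosine(q_vec, c[0]))[1]

# Only public biology chunks are even compared against the query.
print(filtered_search("latest folding benchmarks",
                      {"domain": "biology", "classification": "public"}))
```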
Introducing Agents for Complex Tasks
Agents combine LLMs with access to tools like databases, knowledge bases, and APIs
Allows LLMs to reason through multi-step workflows and leverage diverse data sources
Requires managing tool integrations, which is enabled by the Model Context Protocol (MCP)
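A sketch of the basic agent loop behind these ideas; call_llm() and search_kb() are hypothetical mocks, and in a real system MCP standardizes how a tool catalog like TOOLS is exposed to the model:

```python
# Agent loop sketch: the model either answers or asks for a tool; the
# loop runs the tool and appends the result to the conversation. Both
# the tool and call_llm() are hypothetical mocks.
import json

def search_kb(query: str) -> str:
    """Hypothetical knowledge-base lookup tool."""
    return "FSx for Lustre delivers sub-millisecond latencies."

TOOLS = {"search_kb": search_kb}

def call_llm(messages: list) -> dict:
    # Mock model: request a tool first, answer once a result is present.
    if any(m["role"] == "tool" for m in messages):
        return {"answer": "FSx for Lustre offers sub-millisecond latency."}
    return {"tool": "search_kb", "input": {"query": "FSx for Lustre latency"}}

messages = [{"role": "user", "content": "What latency does FSx for Lustre offer?"}]
for _ in range(5):  # cap the number of reasoning steps
    step = call_llm(messages)
    if "answer" in step:
        print(step["answer"])
        break
    result = TOOLS[step["tool"]](**step["input"])
    messages.append({"role": "tool", "content": json.dumps({"result": result})})
```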
Updating Models with Labeled Data
Supervised fine-tuning: Provides labeled examples of desired inputs and outputs to guide model updates
Distillation: Uses a larger "teacher" model to generate outputs that are then used to train a smaller "student" model
Alignment: Provides examples of preferred and non-preferred responses to shape model behavior
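The three approaches differ mainly in the shape of their training records; a sketch with illustrative field names, not any specific service's schema:

```python
# Illustrative record shapes for the three update strategies. The field
# names are assumptions, not a particular fine-tuning service's schema;
# in practice each strategy would get its own dataset.
import json

sft_record = {  # supervised fine-tuning: input paired with a desired output
    "prompt": "Summarize this incident report: ...",
    "completion": "Root cause: ... Impact: ... Remediation: ...",
}

distillation_record = {  # student trains on the teacher's generations
    "prompt": "Summarize this incident report: ...",
    "completion": "<output generated by the larger teacher model>",
}

alignment_record = {  # preference pair (e.g., for DPO or reward modeling)
    "prompt": "Explain the outage to a customer.",
    "chosen": "We identified the root cause and have applied a fix ...",
    "rejected": "The outage was not our fault ...",
}

with open("train.jsonl", "w") as f:
    for record in (sft_record, distillation_record, alignment_record):
        f.write(json.dumps(record) + "\n")
```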
Continued Pre-training and Training from Scratch
Continued pre-training embeds new knowledge, tone, and relationships into an existing model
Training from scratch is required when the existing model structure is fundamentally different from the desired application
Efficient data loading and checkpointing to storage are critical for the resource-intensive training process
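A sketch of the periodic checkpointing pattern in PyTorch; the model is a stand-in and the path is a placeholder for a high-throughput mount such as FSx for Lustre:

```python
# Periodic checkpointing sketch. The goal is frequent saves of model and
# optimizer state without stalling the accelerators for long; the mount
# path is a placeholder for a high-throughput file system.
import torch
import torch.nn as nn

model = nn.Linear(512, 512)                        # stand-in for a real model
optimizer = torch.optim.AdamW(model.parameters())

for step in range(10_000):
    # ... forward pass, loss, backward pass, optimizer.step() ...
    if step % 1_000 == 0:
        torch.save(
            {"step": step,
             "model": model.state_dict(),
             "optimizer": optimizer.state_dict()},
            f"/fsx/checkpoints/ckpt-{step:06d}.pt",  # placeholder mount path
        )
```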
AWS Storage Solutions for ML Workloads
Amazon FSx for Lustre and Amazon FSx for OpenZFS provide high-performance, scalable file storage for research and development
Amazon S3 Express One Zone offers low-latency, high-throughput object storage optimized for ML training
Mountpoint for Amazon S3 and the Amazon S3 Connector for PyTorch provide file-like interfaces for efficient data access (sketched below)
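A sketch of streaming training data directly from S3 with the S3 Connector for PyTorch (the s3torchconnector package); bucket, prefix, and region are placeholders:

```python
# Streaming a dataset straight from S3 with the Amazon S3 Connector for
# PyTorch (pip install s3torchconnector). Bucket, prefix, and region are
# placeholders; each dataset item is a reader over one S3 object.
from torch.utils.data import DataLoader
from s3torchconnector import S3MapDataset

def read_object(obj):
    # obj is an S3 object reader; return raw bytes (decode/parse as needed).
    return obj.read()

dataset = S3MapDataset.from_prefix(
    "s3://my-training-bucket/shards/",  # placeholder URI
    region="us-east-1",
    transform=read_object,
)
loader = DataLoader(dataset, batch_size=32, num_workers=8)

for batch in loader:
    pass  # feed each batch into the training step
```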
Business Impact and Examples
Biotech firm used S3 Vectors for semantic search on 30 million scientific papers, dramatically reducing research timelines
Meta worked with AWS to scale S3 Express One Zone to 140 Tbps of throughput for large-scale model training
Customers can leverage AWS storage services to build scalable, cost-effective AI pipelines for diverse use cases