Scale your high-traffic events to gen AI deployments with AWS Support (SUP308)

Introduction

The session covers how to scale high-traffic events to generative AI deployments with AWS support.

The presenters introduce themselves - Neil Sandes, a Principal Technical Account Manager at AWS, and MK, a Technical Account Manager at AWS. They are also joined by Manish Sinha, the Senior Director of Advanced Analytics and AI at Georgia Pacific.

Common Challenges in Migrating Prototypes to Production

50% of all migration and modernization initiatives on the cloud will be delayed by at least 2 years, as reported by Gartner.

The key reasons are:

Underestimating the effort to take prototypes and scale them into production-ready deployments.
Unprepared downtime and revenue loss (up to $100,000 per hour of downtime).
Constantly working on reactive support mode, even after scaling the workloads, due to technical debt.

The five common challenges discussed are:

Integration with existing infrastructure
Monitoring and observability
Security
Performance
Cost management

Additional Considerations for Generative AI Workloads

Generative AI brings additional complexities, such as:

Data preparation at scale:
- Centralized data management
- Cleansing and validating data
- Sourcing ground truth data
- Evaluating model outputs
MLOps:
- Continuous training, deployment, and version control of models
- Prompt consistency and library management
- Real-time error monitoring and reaction
Security and governance:
- Preventing training data poisoning and prompt injection
- Securing and validating model outputs
Cost management:
- Implementing financial controls
- Experimenting with smaller models
- Optimizing prompt engineering

AWS Well-Architected Framework

The Well-Architected Framework is a comprehensive guide to build secure, fault-tolerant, resilient, and efficient cloud infrastructure.

It provides design principles, best practices, and questions to assess the current architecture across six pillars: security, reliability, performance efficiency, cost optimization, operational excellence, and sustainability.

AWS Offerings to Support Generative AI Workloads

AWS Countdown

AWS Countdown is offered in two flavors: Standard and Premium.

Countdown Standard helps anticipate capacity needs and work with service teams to approve resource requests.

Countdown Premium is an engineering-led offering that supports the entire journey from initial architecture to production deployment.

Reference Use Case: Fashion Retailer Product Description Generation

The architecture involves a front-end web application, a serverless backend using AWS services (Lambda, API Gateway, DynamoDB, etc.), and a Step Functions workflow to generate product descriptions.

Key design decisions include:

Choosing the right Generative AI model (Bedrock)
Structuring the data strategy for input and output
Implementing security best practices
Enabling logging and cost optimization
Scaling the solution

Georgia Pacific's Generative AI Journey

Georgia Pacific is a large manufacturing company with 30-35,000 employees and $22 billion in revenue.

The key drivers for their AI and Generative AI initiatives are:

Labor scarcity and the need to transfer knowledge to the next generation of workers
Automating undesirable and repetitive tasks in their manufacturing operations
Improving overall equipment effectiveness (OEE) and reducing operating envelope gaps

The Operator Assistant use case:

Combines real-time sensor data with unstructured data (procedures, manuals) to provide prescriptive guidance to operators.
Went through an iterative process of deployment, feedback, and re-engineering to arrive at a scalable, production-ready solution.
Leveraged AWS Countdown Premium to fast-track the journey and optimize the architecture.

Key learnings and next steps:

Parameterize the solution and use standard databases for easy maintainability.
Optimize costs and performance, explore graph-based Retrieval Augmented Generation (RAG).
Investigate edge deployments for faster response times.
Aim for rapid deployment and iteration, while building a robust, long-term architecture.

Scale your high-traffic events to gen AI deployments with AWS Support (SUP308)

Introduction

Common Challenges in Migrating Prototypes to Production

Additional Considerations for Generative AI Workloads

AWS Well-Architected Framework

AWS Offerings to Support Generative AI Workloads

AWS Countdown

Reference Use Case: Fashion Retailer Product Description Generation

Georgia Pacific's Generative AI Journey

Additional Resources

Your Digital Journey deserves a great story.

Build one with us.

Headquarters

Delivery Centre

Scale your high-traffic events to gen AI deployments with AWS Support (SUP308)

Introduction

Common Challenges in Migrating Prototypes to Production

Additional Considerations for Generative AI Workloads

AWS Well-Architected Framework

AWS Offerings to Support Generative AI Workloads

AWS Countdown

Reference Use Case: Fashion Retailer Product Description Generation

Georgia Pacific's Generative AI Journey

Additional Resources

Your Digital Journey deserves a great story.

Build one with us.

This website stores cookies on your computer.