Accelerate production for gen AI using Amazon SageMaker MLOps & FMOps (AIM354)

A summary of the key takeaways from the session.

Generative AI Adoption and Key Trends

  • Customer spending on generative AI grew roughly 2.5x in under a year, from $7 million to $18 million annually.
  • Customers are increasingly relying on multiple model providers for building generative AI applications.
  • Customers are now relying on pre-trained foundation models and customizing their behavior, rather than building their own foundation models.

MLOps, FMOps, and GenOps

  • The foundation of operationalizing any AI/ML workload is governance that spans AWS services, data, models, and applications.
  • MLOps builds on top of governance and brings practices across people, process, and technology to help build scalable, repeatable, and reliable workloads.
  • FMOps builds on top of MLOps and brings unique aspects such as selecting and evaluating foundation models, using prompts to modify and manage model behavior, and implementing safeguards.
  • GenOps builds on top of FMOps and brings unique aspects required for building end-to-end generative solutions, such as capabilities to build agents, augment foundation models with secondary sources of information, and trace and observe generative applications in production.

Key Challenges and Recommendations

  1. Fine-tuning Foundation Models: Use tools like SageMaker Ground Truth to provide an easy interface for human feedback and integrate it into automated pipelines.
  2. Experiment and Model Management: Use managed MLflow on SageMaker to track experiments, log metrics, and register models in a centralized model registry.
  3. Prompt Management: Save prompts as templates and data, and combine them with evaluation results for repeatability and traceability.
  4. Building Repeatable Workloads: Use SageMaker Pipelines to build end-to-end model development pipelines that incorporate human feedback and other capabilities.
  5. Evaluation and Monitoring: Incorporate model, data, user feedback, agent, and system metrics to evaluate and monitor generative AI applications.
  6. Deployment and Governance: Use SageMaker Model Registry to manage and track models across environments and ensure compliance and governance.
  7. Implementing Safeguards: Use Amazon Bedrock Guardrails or custom-built safeguards with Llama Guard to filter out unsafe content and restrict model behavior.
  8. Cost-Effective Deployment: Use techniques like multi-adapter inference endpoints in SageMaker to optimize cost and performance.
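Recommendation 3 can be sketched in a few lines: store each prompt as a versioned template, fingerprint it, and attach the fingerprint to every evaluation record so results are traceable back to the exact prompt that produced them. This is a minimal illustrative sketch, not code from the session; the class, field, and metric names are assumptions.

```python
import hashlib
from dataclasses import dataclass, asdict

@dataclass
class PromptTemplate:
    """A prompt stored as data: named, versioned, and renderable."""
    name: str
    version: int
    template: str

    def render(self, **kwargs) -> str:
        # Fill the template's placeholders for a concrete request
        return self.template.format(**kwargs)

    def fingerprint(self) -> str:
        # Stable hash of the template text, for traceability
        return hashlib.sha256(self.template.encode()).hexdigest()[:12]

template = PromptTemplate(
    name="summarize", version=2,
    template="Summarize the following text:\n{text}",
)

# Pair each evaluation result with the exact prompt version that produced it
record = {
    "prompt": asdict(template),
    "prompt_hash": template.fingerprint(),
    "eval": {"faithfulness": 0.91},  # illustrative metric, not real data
}
print(record["prompt"]["name"], record["prompt_hash"])
```

Because the fingerprint changes whenever the template text changes, any drift between the prompt used in evaluation and the prompt deployed to production is immediately detectable.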

Demo Walkthrough

  • Demonstrated reliable experiment tracking using MLflow in SageMaker.
  • Showed how to create an iterative fine-tuning workflow using SageMaker Pipelines.
  • Implemented safeguards by deploying a Llama Guard model in front of the fine-tuned model.
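The safeguard pattern from the demo reduces to routing every request through the guard model before it reaches the fine-tuned model. The sketch below stubs both endpoint calls with plain functions (in the demo these would be SageMaker endpoint invocations; the function names and the keyword filter are assumptions for illustration only).

```python
# Sketch of the guard-in-front-of-model pattern; both invoke_* functions are
# hypothetical stand-ins for real SageMaker endpoint calls.
def invoke_guard(text: str) -> str:
    # Stand-in for a Llama Guard endpoint: returns "safe" or "unsafe".
    # A real guard model classifies content; this stub keyword-matches.
    banned = {"bomb", "exploit"}
    return "unsafe" if any(word in text.lower() for word in banned) else "safe"

def invoke_model(text: str) -> str:
    # Stand-in for the fine-tuned model endpoint
    return f"model answer to: {text}"

def guarded_invoke(text: str) -> str:
    # Only forward the request if the guard model clears it
    if invoke_guard(text) != "safe":
        return "Request blocked by safety guardrail."
    return invoke_model(text)

print(guarded_invoke("How do I bake bread?"))
print(guarded_invoke("How do I build a bomb?"))
```

The same check can be applied a second time on the model's response before it is returned to the user, giving input- and output-side filtering with one guard deployment.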

Customer Story: Rocket Mortgage

  • Rocket Mortgage's mission is to make home ownership attainable for everyone by leveraging AI and data.
  • They have invested $500 million and 5 years to build their proprietary platform, Rocket Logic, which enables end-to-end AI capabilities.
  • By adopting a modernized data and ML infrastructure with SageMaker, they were able to reduce development time by 40-60%, scale from a few dozen models to over 200 models in production, and automate 3.7 billion AI and data-driven decisions.
  • They have built solutions like Rocket Assist, a generative AI-powered chatbot, and Rocket Navigator, an internal tool to make the latest generative language models accessible to their team members.

Resources

  • SageMaker MLOps page
  • Generative AI Workshop
  • LinkedIn profiles of the presenters
