Unlock the power of generative AI with AWS Serverless (SVS319)

Summary of the Session Transcript

Introduction and Overview

  • The session covered how to build applications using generative AI models, exploring various use cases and patterns for model inferencing.
  • The speakers included Uma Ramadoss (Principal Specialist Solutions Architect), Patrick O'Connor (Prototyping Engineer), Martin O'Gorman (Head of Data Science at Parameta Solutions), and Dhiraj Mahapatro (Principal Specialist SA).
  • The session aimed to demonstrate how to combine the speed of serverless and the power of generative AI to rapidly deliver applications and go to market faster.

Generative AI Stack on AWS

  • The foundation of the generative AI stack on AWS is infrastructure for model training and inference, including GPUs, AWS Trainium and Inferentia chips, and the Nitro System.
  • On top of this, Amazon Bedrock provides a serverless service with capabilities around foundation models, agents, and guardrails to build applications.
  • The highest abstraction layer includes services such as Amazon Q Business, Amazon Q Developer for code generation, Amazon QuickSight, and Amazon Connect.

Serverless Integration Patterns

  • Serverless APIs on AWS, such as Amazon API Gateway and AWS AppSync, provide a managed and scalable way to interface with foundation models.
  • Serverless integration patterns include using Lambda functions to poll from SQS, EventBridge pushing messages to targets like Step Functions, and Step Functions waiting for human intervention.
  • These patterns can be applied to integrate with Bedrock or SageMaker for generative AI applications.
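As a concrete illustration of the last point, the sketch below shows a Lambda handler invoking a Bedrock model through the Converse API with boto3. The model ID is a placeholder for whichever foundation model is enabled in your account; the request-building helper is an illustrative convenience, not part of any AWS SDK.

```python
# Minimal sketch: a Lambda function calling Amazon Bedrock via the Converse API.
# MODEL_ID is a placeholder; substitute any model enabled in your account.
MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"

def build_converse_request(prompt: str, max_tokens: int = 512) -> dict:
    """Assemble the keyword arguments for bedrock-runtime's converse() call."""
    return {
        "modelId": MODEL_ID,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": max_tokens},
    }

def lambda_handler(event, context):
    # boto3 is available in the Lambda runtime; imported here so the
    # request-building helper above can be exercised without AWS access.
    import boto3
    client = boto3.client("bedrock-runtime")
    response = client.converse(**build_converse_request(event["prompt"]))
    # The Converse API returns the assistant message under output.message.
    return {"answer": response["output"]["message"]["content"][0]["text"]}
```

The same handler shape works behind API Gateway for synchronous calls or behind an SQS trigger for the asynchronous pattern described above.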

Use Cases and Patterns

  • The session covered the following use cases and patterns:
    1. Chatbots: Using an asynchronous architecture with SQS and Lambda to handle scalable chatbot scenarios.
    2. Retrieval Augmented Generation (RAG): Utilizing Step Functions and Distributed Map to process large-scale data and create vector embeddings to enhance the conversational ability of generative AI models.
    3. Virtual Assistants: Leveraging Step Functions and "function calling" or "tool use" capabilities to allow large language models to orchestrate actions on the user's behalf.
    4. Document Summarization: Implementing summarization use cases using serverless patterns, including real-time and batch processing approaches.
    5. Content Creation: Applying prompt chaining and human-in-the-loop techniques using Step Functions to improve the accuracy and performance of generative AI applications.
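To make the "tool use" pattern in item 3 concrete, the sketch below defines a tool specification in the shape the Bedrock Converse API expects and a dispatcher that executes the tool the model requested, then wraps the result as a toolResult message to send back. The tool name, schema, and lookup logic are illustrative assumptions, not from the session.

```python
# Illustrative tool definition for the Converse API's toolConfig parameter.
TOOL_CONFIG = {
    "tools": [{
        "toolSpec": {
            "name": "get_order_status",
            "description": "Look up the status of an order by its ID.",
            "inputSchema": {"json": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            }},
        }
    }]
}

def get_order_status(order_id: str) -> dict:
    # Stand-in for a real lookup (e.g., a DynamoDB query).
    return {"order_id": order_id, "status": "shipped"}

TOOLS = {"get_order_status": get_order_status}

def dispatch_tool_use(tool_use: dict) -> dict:
    """Run the tool named in the model's toolUse block and wrap the result
    as the toolResult message the Converse API expects on the next turn."""
    result = TOOLS[tool_use["name"]](**tool_use["input"])
    return {
        "role": "user",
        "content": [{"toolResult": {
            "toolUseId": tool_use["toolUseId"],
            "content": [{"json": result}],
        }}],
    }
```

In the virtual-assistant pattern, a Step Functions loop would alternate between calling the model (passing TOOL_CONFIG) and dispatching any toolUse blocks it returns until the model produces a final answer.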

Customer Perspective: Parameta Solutions

  • Parameta Solutions, the data and analytics division of TP ICAP Group, shared their journey in adopting serverless and generative AI technologies.
  • They showcased use cases like automating the triage of client support tickets, leveraging serverless and generative AI to unlock siloed data, and driving innovation, digital transformation, and cost optimization.

Key Takeaways

  • Combining serverless and generative AI enables building evolutionary, scalable, and flexible applications that can easily adapt to new models and use cases.
  • Serverless integration patterns, such as using SQS, Step Functions, and Distributed Map, can help handle the scalability and reliability challenges of working with generative AI models.
  • Incorporating human-in-the-loop processes and prompt engineering strategies can improve the accuracy and performance of generative AI applications.
  • The session provided various resources, including sample applications, blogs, and learning materials, to help attendees further explore and experiment with serverless and generative AI technologies.
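The human-in-the-loop takeaway can be sketched as a Step Functions state machine: a task that generates a draft with Bedrock, followed by a task that pauses on a task token until a reviewer approves. The Amazon States Language definition is expressed here as a Python dict; the state names and Lambda ARNs are placeholders.

```python
# Sketch of an ASL definition for prompt chaining with a human-review pause.
# draft_arn and review_arn are placeholder Lambda ARNs.
def prompt_chain_definition(draft_arn: str, review_arn: str) -> dict:
    return {
        "StartAt": "GenerateDraft",
        "States": {
            "GenerateDraft": {
                "Type": "Task",
                "Resource": draft_arn,  # Lambda that calls Bedrock
                "Next": "HumanReview",
            },
            "HumanReview": {
                "Type": "Task",
                # waitForTaskToken pauses the execution until the reviewer's
                # application calls SendTaskSuccess with the token.
                "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
                "Parameters": {
                    "FunctionName": review_arn,
                    "Payload": {"draft.$": "$", "taskToken.$": "$$.Task.Token"},
                },
                "End": True,
            },
        },
    }
```

Chaining further prompt steps means adding more Task states between generation and review, each refining the previous state's output.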
