Build an AI gateway for Amazon Bedrock with AWS AppSync (FWM310)

Introduction

  • Generative AI is becoming increasingly prevalent, with more customers adopting the technology and looking to integrate it into their applications.
  • Customers face several challenges when trying to connect their applications to their generative AI backend, such as:
    • Connecting to their backend securely and with flexible access control
    • Handling flexible request patterns, including synchronous and long-running invocations
    • Maintaining flexibility in model selection and protecting their intellectual property

AWS AppSync as an AI Gateway

  • AWS AppSync has evolved into an AI gateway for many customers, addressing the challenges mentioned above.
  • AppSync allows you to:
    • Connect securely and converse with your Amazon Bedrock backend
    • Deliver real-time updates for long-running workloads using its native WebSocket-based subscriptions
    • Maintain control over your users and model access
    • Shape responses so clients get exactly the data they need, using GraphQL's built-in type system
    • Decouple your apps from your models, allowing your models to grow at their own pace
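The capabilities above can be sketched as a minimal GraphQL schema for an AppSync AI gateway. Type and field names here are illustrative assumptions, not from the session; `@aws_subscribe` is AppSync's built-in directive for wiring subscriptions to mutations.

```graphql
type Conversation {
  id: ID!
  answer: String
}

type Mutation {
  # Kicks off a long-running Bedrock invocation in the backend
  startConversation(prompt: String!): Conversation
  # Called by the backend when the model finishes; triggers the subscription
  publishAnswer(id: ID!, answer: String!): Conversation
}

type Subscription {
  # Clients receive the answer over AppSync's WebSocket connection
  onAnswer(id: ID!): Conversation
    @aws_subscribe(mutations: ["publishAnswer"])
}
```

Because clients depend only on this schema, the model behind `startConversation` can be swapped without any client change, which is the decoupling the bullet above describes.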

Connecting to Amazon Bedrock with AppSync

  • AppSync provides several ways to integrate with your generative AI backend, including:
    • Using an HTTP resolver to directly interact with your models
    • Leveraging the new first-party integration with the Amazon Bedrock runtime for simpler, short synchronous invocations
    • Calling the InvokeModel or Converse APIs to send requests to your models
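As a minimal sketch of the last option, the snippet below builds the request payload a client (or an AppSync HTTP resolver) would send to the Bedrock InvokeModel API for an Anthropic Claude model. The model ID, prompt, and token limit are placeholder assumptions.

```python
import json

def build_invoke_model_request(prompt, model_id="anthropic.claude-3-5-sonnet-20240620-v1:0"):
    # Anthropic models on Bedrock expect the Messages API body format.
    body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 512,
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": prompt}]}
        ],
    }
    return {
        "modelId": model_id,
        "contentType": "application/json",
        "body": json.dumps(body),
    }

# With boto3, the same request would be sent as:
#   client = boto3.client("bedrock-runtime")
#   response = client.invoke_model(**build_invoke_model_request("Summarize this ticket"))
```

The Converse API takes a similar messages structure but normalizes it across model providers, which is why it pairs well with the decoupling goal above.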

Enterprise Integrations using AppSync

Content Analysis Use Case

  • Enterprises often use collaborative applications to interact with employees, documents, and backends.
  • AppSync can be used to streamline real-time updates and enable content analysis using large language models.
  • The solution integrates with Microsoft productivity applications, allowing users to leverage the AI capabilities directly within their tools.

Contact Center Use Case

  • Amazon Connect has limitations, such as an 8-second timeout for real-time updates.
  • By leveraging AppSync's real-time capabilities, the team created a solution that pushes updates to customers in real time, even for long-running workflows.
  • The solution uses Amazon Connect to capture the user's intent, triggers a backend workflow, and sends updates back to the customer through AppSync subscriptions.
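The flow above can be sketched as a small in-process simulation: the backend workflow runs longer than Connect's 8-second limit, so instead of blocking the caller, each progress update is published to the subscriber. Here a queue stands in for an AppSync subscription; step names are illustrative.

```python
import queue
import threading

def backend_workflow(intent, publish):
    # Long-running workflow triggered after Connect captures the intent.
    # Each intermediate state is pushed out rather than returned at the end.
    for step in ("received", "processing", "complete"):
        publish({"intent": intent, "status": step})

# The queue plays the role of the AppSync subscription channel.
updates = queue.Queue()
worker = threading.Thread(
    target=backend_workflow, args=("check-order-status", updates.put)
)
worker.start()
worker.join()

received = [updates.get() for _ in range(3)]
```

In the real architecture, `publish` would be an AppSync mutation (like `publishAnswer`) that fans out to subscribed clients over WebSockets.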

Common Challenges when Building on Large Language Models (LLMs)

  1. Selecting the Right Model: Understand the various models available (e.g., Anthropic's Claude 3.5 Sonnet, Claude 3 Opus, Claude 3.5 Haiku) and their strengths to choose the most suitable one for your use case.

  2. Prompt Engineering: Leverage prompting techniques such as being clear and direct, using a structured prompt, leveraging examples, and implementing chain of thought to improve the quality of the model's outputs.
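The prompting techniques listed above can be combined in a small helper. This is a generic sketch, not the speakers' code; the XML-style section tags and the chain-of-thought instruction follow common Claude prompting conventions.

```python
def build_prompt(task, examples, use_cot=True):
    # Clear, direct instruction first.
    parts = [f"Task: {task}"]
    # Few-shot examples in a structured, delimited format.
    for inp, out in examples:
        parts.append(f"<example>\nInput: {inp}\nOutput: {out}\n</example>")
    # Chain-of-thought: ask the model to reason before answering.
    if use_cot:
        parts.append("Think step by step inside <thinking> tags before answering.")
    return "\n\n".join(parts)
```

Structured delimiters make it easier for the model to separate instructions from data, and easier for you to parse the answer back out.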

  3. Handling Latency: Use techniques like streaming outputs, leveraging faster models, reducing the number of output tokens, and prompt caching to optimize latency.
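Streaming is the biggest perceived-latency win: users see tokens as they arrive instead of waiting for the whole completion. The sketch below shows the consumption pattern; `fake_stream` is a stand-in for the event stream that `invoke_model_with_response_stream` returns, so the example runs without AWS credentials.

```python
def fake_stream():
    # Stand-in for a Bedrock response stream: each event carries a chunk
    # of the generated text as bytes.
    for chunk in ("The ", "order ", "shipped."):
        yield {"chunk": {"bytes": chunk.encode()}}

def consume(stream, on_token):
    # Forward each token as it arrives (e.g. via an AppSync mutation that
    # fans out to subscribers), then return the assembled text.
    text = []
    for event in stream:
        token = event["chunk"]["bytes"].decode()
        on_token(token)
        text.append(token)
    return "".join(text)
```

Pairing this with AppSync subscriptions lets you stream partial output to the browser even when the resolver itself is asynchronous.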

  4. Generating Structured Data: Provide clear instructions, use examples, and explore advanced techniques like tool use to get consistent, high-quality structured data from the LLM.
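The "tool use" technique mentioned above can be sketched with the Converse API's tool configuration: you describe a tool whose input schema is exactly the structure you want back, and force the model to "call" it. The tool name and fields below are illustrative assumptions.

```python
def ticket_extraction_tool():
    # Converse API toolConfig: the JSON schema doubles as the output
    # contract, so the model's tool input is guaranteed to be structured.
    return {
        "tools": [{
            "toolSpec": {
                "name": "record_ticket",
                "description": "Record the extracted fields of a support ticket.",
                "inputSchema": {"json": {
                    "type": "object",
                    "properties": {
                        "category": {"type": "string"},
                        "priority": {"type": "string",
                                     "enum": ["low", "medium", "high"]},
                    },
                    "required": ["category", "priority"],
                }},
            }
        }],
        # Force the model to answer via the tool rather than free text.
        "toolChoice": {"tool": {"name": "record_ticket"}},
    }

# Passed as: client.converse(modelId=..., messages=..., toolConfig=ticket_extraction_tool())
```

Compared with asking for "JSON only" in the prompt, this approach pushes the schema enforcement onto the model provider and removes most parsing failures.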

  5. Evaluating LLM Features: Utilize a combination of evaluation methods, including informal "vibe checks," human-based evaluation, A/B testing, programmatic checks, and LLM-as-judge, to quickly and confidently iterate on your LLM-powered products.
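Programmatic checks are the cheapest layer in the evaluation stack above: deterministic rules that run on every output before more expensive human or LLM-as-judge review. The specific rules here are illustrative.

```python
import json

def programmatic_checks(output):
    # Returns a list of failure reasons; an empty list means the output
    # passed every deterministic check.
    failures = []
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    if "priority" in data and data["priority"] not in ("low", "medium", "high"):
        failures.append("priority out of range")
    return failures
```

Running checks like this in CI against a fixed prompt suite gives a fast regression signal each time you change a prompt or swap models.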
