TalksSearching images through patterns: An AI-powered serverless solution (DEV204)

Searching images through patterns: An AI-powered serverless solution (DEV204)

Here's a detailed summary of the video transcription in markdown format, broken down into sections for better readability:

Current State of Generative Models

Tremendous advancements in the last two years (2023-2024) with new generation of models from companies like Meta, OpenAI, Anthropic, etc.
Newer models have better capabilities, higher quality, and are more cost-effective.
Key advancements:
- Improved GPU technology enabling scaling of models to billions of parameters.
- Advancements in data sets and training techniques, including use of synthetic data and multilingual models.

Key Features of Newer Models

Larger context windows (up to 300,000 tokens)
Multimodality - ability to process and generate content in text, image, audio, and video formats
Improved reasoning and inference capabilities, going beyond simple question answering
Agentic workflow - models becoming intelligent agents capable of interacting with external systems and performing autonomous actions

Text Models vs. Multimodal Models

Text models are designed to ingest and generate text based on patterns in textual data.
Multimodal models can process information from multiple modalities (text, image, audio, video) and integrate visual and auditory context.

Zero-Shot Prompting

Zero-shot prompting allows models to perform tasks immediately without requiring prior examples or task-specific training.
Benefits for business use cases:
- Extracts information without needing previous examples
- Allows faster implementation of new applications and features
- Saves time and cost on data preparation and model training

Visual Examples

Visual question answering
Diagram interpretation
Image captioning
Grounding (identifying object locations in an image)

Customer Use Case

Digital printing company that creates and prints designs on fabrics for garments.
Key challenges:
- Large, unstructured design file repository (few terabytes)
- Manual process for creating mood boards and finding inspiration images
- Reliance on external resources for images, incurring additional costs
Requirements:
- Create a searchable attribute database of design files
- Maintain privacy and security of the design files
- Implement a low-cost, low-maintenance solution

Solution Approach

Pre-processing on-premises: Segmenting and sampling images on local infrastructure to reduce processing costs.
Generative AI model: Using Anthropic's Clover 3 (Haiku) model on AWS Bedrock for attribute extraction, instead of a machine learning model.
Serverless Architecture:
- Uploading image segments to S3
- Processing images and storing attributes in Aurora Serverless database
- Implementing a searchable interface using API Gateway, Lambda, and DynamoDB

Key Takeaways

Rapid development and implementation of the solution using serverless and managed services.
Generative AI models proved effective in extracting accurate attributes from a large, unstructured design file repository.
Pay-as-you-go pricing model and leveraging free tiers helped keep the solution cost-effective.
Constant feedback and a simplified, low-maintenance approach were crucial for the successful implementation.
Newer capabilities like prompt routing in services like Bedrock open up more possibilities for future use cases.
Cost is a significant factor, and intelligent use of cloud infrastructure can help reduce costs drastically.

Your Digital Journey deserves a great story.

Build one with us.

These cookies are used to collect information about how you interact with this website and allow us to remember you. We use this information to improve and customize your browsing experience, as well as for analytics.

If you decline, your information won’t be tracked when you visit this website. A single cookie will be used in your browser to remember your preference.

Searching images through patterns: An AI-powered serverless solution (DEV204)

Current State of Generative Models

Key Features of Newer Models

Text Models vs. Multimodal Models

Zero-Shot Prompting

Visual Examples

Customer Use Case

Solution Approach

Key Takeaways

Your Digital Journey deserves a great story.

Build one with us.

Headquarters

Delivery Centre

Searching images through patterns: An AI-powered serverless solution (DEV204)

Current State of Generative Models

Key Features of Newer Models

Text Models vs. Multimodal Models

Zero-Shot Prompting

Visual Examples

Customer Use Case

Solution Approach

Key Takeaways

Your Digital Journey deserves a great story.

Build one with us.

This website stores cookies on your computer.