AI at production scale: Cloudera’s inference service with NVIDIA (AIM221)
Generative AI in Enterprises
2022 saw an explosion of Generative AI, with the launch of ChatGPT transforming how enterprises and businesses work.
In 2023, enterprises experimented with Generative AI, running proofs of concept to understand how it could transform their business.
In 2024 and beyond, enterprises are moving Generative AI into production, leveraging it to increase productivity and drive business transformation.
NVIDIA's Contribution
NVIDIA announced NVIDIA NIM, an accelerated runtime for Generative AI, earlier this year.
NVIDIA NIM provides a set of easy-to-use microservices that help enterprises develop and deploy Generative AI applications.
NVIDIA NIM includes pre-built containers and Helm charts, allowing enterprises to deploy their Generative AI applications easily.
NVIDIA NIM is based on industry-standard APIs and supports custom models, enabling deployment on various platforms, including Cloudera on AWS instances.
NVIDIA NIM significantly improves the performance of Generative AI applications, delivering a 2-5x increase in throughput compared to running the same models without NIM.
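Because NIM exposes industry-standard, OpenAI-compatible APIs, an application can call a deployed model with plain HTTP. A minimal sketch of such a client, assuming a hypothetical local endpoint URL and model id (both depend on your actual deployment):

```python
import json
import urllib.request

# Hypothetical endpoint and model id for illustration only; NIM serves an
# OpenAI-compatible chat completions API, but the host, port, and model
# name depend on which NIM container you deployed and where.
NIM_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "meta/llama3-8b-instruct"

def build_chat_request(prompt: str, model: str = MODEL) -> dict:
    """Build an OpenAI-style chat completion payload for a NIM endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
        "temperature": 0.2,
    }

def ask(prompt: str) -> str:
    """POST the request to the NIM microservice and return the reply text."""
    req = urllib.request.Request(
        NIM_URL,
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the request and response shapes follow the OpenAI convention, existing client libraries and applications can typically be pointed at a NIM endpoint by changing only the base URL and model name.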
Cloudera's AI Inference Service
Cloudera's AI Inference Service is a new service that leverages NVIDIA NIM at its core.
The service provides auto-scaling and high availability, enterprise-grade security and governance, and enhanced monitoring capabilities.
The AI Inference Service is tightly integrated with other Cloudera services, such as the AI Workbench, MLflow, and the AI Model Registry, allowing seamless development and deployment of Generative AI applications.
The service enables enterprises to keep their data and interactions private, as the models are hosted within the enterprise's own infrastructure, either on-premises or on AWS.
This provides enterprises with more control and customizability over their Generative AI applications, addressing concerns around data exposure and third-party dependence.
Key Benefits of the Cloudera AI Inference Service with NVIDIA
One-click deployment of Generative AI models, significantly reducing time-to-value.
Unified security and governance across all data and applications, ensuring data privacy and compliance.
Single platform for hosting Generative AI models, simplifying management and support.
Tight integration with Cloudera's other AI services, enabling seamless development and deployment.
Demonstration
The presentation includes a demonstration of the Cloudera AI Inference Service, showcasing the ease of deploying and managing Generative AI models within the enterprise environment.
The demonstration also shows how the Cloudera SQL AI Assistant can be integrated with the AI Inference Service, allowing users to generate, edit, explain, optimize, and comment on SQL code using Generative AI capabilities.
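An assistant like this typically works by wrapping the user's SQL (or request) and the chosen task in a prompt, then sending that prompt to the inference endpoint. A hedged sketch of the prompt-construction step, where the task names and template wording are assumptions for illustration, not Cloudera's actual implementation:

```python
# Hypothetical prompt builder showing how a SQL assistant might frame each
# task (generate, edit, explain, optimize, comment) for a hosted model.
# The instruction wording below is an assumption, not Cloudera's prompts.

TASK_INSTRUCTIONS = {
    "generate": "Write a SQL query for the following request.",
    "edit": "Apply the requested change to the following SQL query.",
    "explain": "Explain what the following SQL query does, step by step.",
    "optimize": "Rewrite the following SQL query to be more efficient.",
    "comment": "Add inline comments to the following SQL query.",
}

def build_sql_prompt(task: str, sql_or_request: str, dialect: str = "Hive") -> str:
    """Compose a single prompt string for the chosen SQL-assistant task."""
    if task not in TASK_INSTRUCTIONS:
        raise ValueError(f"unknown task: {task}")
    return (
        f"You are a {dialect} SQL assistant.\n"
        f"{TASK_INSTRUCTIONS[task]}\n\n"
        f"{sql_or_request}"
    )

print(build_sql_prompt("explain", "SELECT COUNT(*) FROM sales WHERE year = 2024"))
```

The resulting string would then be sent to the AI Inference Service endpoint (for example, via an OpenAI-compatible chat completions call), keeping the query text inside the enterprise's own infrastructure.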