AI at production scale: Cloudera’s inference service with NVIDIA (AIM221)
Generative AI in Enterprises
2022 saw an explosion of Generative AI, with the launch of ChatGPT transforming how enterprises and businesses work.
In 2023, enterprises experimented with Generative AI, running proofs of concept to understand how it could transform their business.
In 2024 and beyond, enterprises are moving Generative AI into production, leveraging it to increase productivity and drive business transformation.
NVIDIA's Contribution
NVIDIA announced NVIDIA NIM, an accelerated runtime for Generative AI, earlier this year.
NVIDIA NIM provides a set of easy-to-use microservices that help enterprises develop and deploy Generative AI applications.
NVIDIA NIM includes pre-built containers and Helm charts, allowing enterprises to deploy their Generative AI applications easily.
NVIDIA NIM is based on industry-standard APIs and supports custom models, enabling deployment on various platforms, including Cloudera on AWS instances.
NVIDIA NIM significantly improves the performance of Generative AI applications, delivering a 2-5x increase in throughput compared to running the same models without NIM.
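Because NIM exposes industry-standard, OpenAI-compatible APIs, an application can call a deployed model with plain HTTP. A minimal sketch of such a client, assuming a hypothetical local endpoint URL and model id (both depend on your actual deployment):

```python
import json
import urllib.request

# Hypothetical endpoint and model id for illustration only; NIM serves an
# OpenAI-compatible chat completions API, but the host, port, and model
# name depend on which NIM container you deployed and where.
NIM_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "meta/llama3-8b-instruct"

def build_chat_request(prompt: str, model: str = MODEL) -> dict:
    """Build an OpenAI-style chat completion payload for a NIM endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
        "temperature": 0.2,
    }

def ask(prompt: str) -> str:
    """POST the request to the NIM microservice and return the reply text."""
    req = urllib.request.Request(
        NIM_URL,
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the request and response shapes follow the OpenAI convention, existing client libraries and applications can typically be pointed at a NIM endpoint by changing only the base URL and model name.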
Cloudera's AI Inference Service
Cloudera's AI Inference Service is a new service that leverages NVIDIA NIM at its core.
The service provides auto-scaling and high availability, enterprise-grade security and governance, and enhanced monitoring capabilities.
The AI Inference Service is tightly integrated with other Cloudera services, such as the AI Workbench, MLflow, and the AI Model Registry, allowing seamless development and deployment of Generative AI applications.
The service enables enterprises to keep their data and interactions private, as the models are hosted within the enterprise's own infrastructure, either on-premises or on AWS.
This provides enterprises with more control and customizability over their Generative AI applications, addressing concerns around data exposure and third-party dependence.
Key Benefits of the Cloudera AI Inference Service with NVIDIA
One-click deployment of Generative AI models, significantly reducing time-to-value.
Unified security and governance across all data and applications, ensuring data privacy and compliance.
Single platform for hosting Generative AI models, simplifying management and support.
Tight integration with Cloudera's other AI services, enabling seamless development and deployment.
Demonstration
The presentation includes a demonstration of the Cloudera AI Inference Service, showcasing the ease of deploying and managing Generative AI models within the enterprise environment.
The demonstration also shows how the Cloudera SQL AI Assistant can be integrated with the AI Inference Service, allowing users to generate, edit, explain, optimize, and comment on SQL code using Generative AI capabilities.
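An assistant like this typically works by wrapping the user's SQL (or request) and the chosen task in a prompt, then sending that prompt to the inference endpoint. A hedged sketch of the prompt-construction step, where the task names and template wording are assumptions for illustration, not Cloudera's actual implementation:

```python
# Hypothetical prompt builder showing how a SQL assistant might frame each
# task (generate, edit, explain, optimize, comment) for a hosted model.
# The instruction wording below is an assumption, not Cloudera's prompts.

TASK_INSTRUCTIONS = {
    "generate": "Write a SQL query for the following request.",
    "edit": "Apply the requested change to the following SQL query.",
    "explain": "Explain what the following SQL query does, step by step.",
    "optimize": "Rewrite the following SQL query to be more efficient.",
    "comment": "Add inline comments to the following SQL query.",
}

def build_sql_prompt(task: str, sql_or_request: str, dialect: str = "Hive") -> str:
    """Compose a single prompt string for the chosen SQL-assistant task."""
    if task not in TASK_INSTRUCTIONS:
        raise ValueError(f"unknown task: {task}")
    return (
        f"You are a {dialect} SQL assistant.\n"
        f"{TASK_INSTRUCTIONS[task]}\n\n"
        f"{sql_or_request}"
    )

print(build_sql_prompt("explain", "SELECT COUNT(*) FROM sales WHERE year = 2024"))
```

The resulting string would then be sent to the AI Inference Service endpoint (for example, via an OpenAI-compatible chat completions call), keeping the query text inside the enterprise's own infrastructure.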