Kevin Klues
From foundation model to hosted AI solution in minutes
#1 · about 3 minutes
Introducing the IONOS AI Model Hub for easy inference
The IONOS AI Model Hub provides a simple REST API for accessing open-source foundation models, plus a managed vector database for retrieval-augmented generation (RAG).
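A minimal sketch of what a chat-completion call against such an API could look like. The endpoint URL, payload shape, and model identifier below are illustrative assumptions, not the documented IONOS interface:

```python
import os
import requests

# Placeholder endpoint and payload shape for illustration only;
# consult the IONOS AI Model Hub documentation for the actual REST API.
API_URL = "https://inference.example.ionos.com/v1/chat/completions"  # assumed URL
headers = {"Authorization": f"Bearer {os.environ['IONOS_API_TOKEN']}"}

payload = {
    "model": "meta-llama/Llama-3-8B-Instruct",  # assumed model identifier
    "messages": [
        {"role": "user", "content": "Summarize RAG in one sentence."}
    ],
}

response = requests.post(API_URL, headers=headers, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```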
#2 · about 1 minute
Exploring the curated open-source foundation models available
The platform offers leading open-source models such as Meta Llama 3 for English, Mistral for European languages, and Stable Diffusion XL for image generation.
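Discovering which models are available could look like the following sketch; the endpoint path and OpenAI-style response shape are assumptions for illustration:

```python
import os
import requests

# Hypothetical model-listing call; path and response format are assumed,
# not taken from the IONOS documentation.
API_URL = "https://inference.example.ionos.com/v1/models"  # assumed URL
headers = {"Authorization": f"Bearer {os.environ['IONOS_API_TOKEN']}"}

response = requests.get(API_URL, headers=headers, timeout=30)
response.raise_for_status()
for model in response.json().get("data", []):
    print(model.get("id"))
```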
#3 · about 7 minutes
How to implement RAG with a single API call
Retrieval-Augmented Generation (RAG) is simplified by abstracting vector database lookups and prompt augmentation into one API request using collection IDs and queries.
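As a rough sketch of that single-call pattern: the endpoint, field names ("collectionId", "query"), and response shape below are assumptions chosen to illustrate the idea, not the documented API:

```python
import os
import requests

# Illustrative single-call RAG request: the service looks up matching chunks
# in the named vector database collection, augments the prompt with them,
# and runs inference, all in one round trip.
API_URL = "https://inference.example.ionos.com/v1/rag"  # assumed URL
headers = {"Authorization": f"Bearer {os.environ['IONOS_API_TOKEN']}"}

payload = {
    "model": "meta-llama/Llama-3-8B-Instruct",  # assumed model identifier
    "collectionId": "my-document-collection",   # assumed field: collection to search
    "query": "What is our refund policy?",      # assumed field: user question
}

response = requests.post(API_URL, headers=headers, json=payload, timeout=60)
response.raise_for_status()
print(response.json())
```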
#4 · about 1 minute
Building end-to-end AI solutions in European data centers
Combine the AI Model Hub with IONOS Managed Kubernetes to build and deploy full AI applications within German data centers for data sovereignty.
#5 · about 3 minutes
Enabling direct GPU access within managed Kubernetes
The NVIDIA GPU Operator will enable direct consumption of GPU resources within IONOS Managed Kubernetes by automatically installing the required NVIDIA drivers, container toolkit, and Kubernetes device plugin.
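Once the operator has installed those components, the device plugin advertises GPUs as the extended resource nvidia.com/gpu. A minimal sanity check with the official Kubernetes Python client might look like this:

```python
from kubernetes import client, config

# Requires the `kubernetes` Python client and access to the cluster.
config.load_kube_config()  # or config.load_incluster_config() inside the cluster
v1 = client.CoreV1Api()

# Print how many GPUs each node advertises after the GPU Operator is running.
for node in v1.list_node().items:
    gpus = node.status.allocatable.get("nvidia.com/gpu", "0")
    print(f"{node.metadata.name}: {gpus} allocatable GPU(s)")
```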
#6 · about 3 minutes
Deploying custom inference workloads with NVIDIA NIMs
With the GPU Operator in place, request GPUs directly in a pod spec and deploy NVIDIA Inference Microservices (NIMs) to run custom, containerized AI models on your own infrastructure.
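A minimal sketch of such a deployment using the Kubernetes Python client. The NIM image tag is an assumption; NIM containers are distributed via NVIDIA's NGC registry and typically require an image pull secret, which is omitted here:

```python
from kubernetes import client, config

config.load_kube_config()

# Pod that requests one GPU via the nvidia.com/gpu extended resource,
# which the GPU Operator's device plugin makes schedulable.
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="llama3-nim"),
    spec=client.V1PodSpec(
        containers=[
            client.V1Container(
                name="nim",
                image="nvcr.io/nim/meta/llama3-8b-instruct:latest",  # assumed image tag
                ports=[client.V1ContainerPort(container_port=8000)],
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}  # one GPU for this pod
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```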