Kevin Klues
From foundation model to hosted AI solution in minutes
#1 · about 3 minutes
Introducing the IONOS AI Model Hub for easy inference
The IONOS AI Model Hub provides a simple REST API for accessing open-source foundation models, plus a managed vector database for retrieval-augmented generation (RAG).
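A minimal sketch of what a chat-completion call against such an API could look like. The endpoint URL, payload shape, and model identifier below are illustrative assumptions, not the documented IONOS interface:

```python
import os
import requests

# Placeholder endpoint and payload shape for illustration only;
# consult the IONOS AI Model Hub documentation for the actual REST API.
API_URL = "https://inference.example.ionos.com/v1/chat/completions"  # assumed URL
headers = {"Authorization": f"Bearer {os.environ['IONOS_API_TOKEN']}"}

payload = {
    "model": "meta-llama/Llama-3-8B-Instruct",  # assumed model identifier
    "messages": [
        {"role": "user", "content": "Summarize RAG in one sentence."}
    ],
}

response = requests.post(API_URL, headers=headers, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```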
#2 · about 1 minute
Exploring the curated open-source foundation models available
The platform offers leading open-source models such as Meta Llama 3 for English, Mistral for European languages, and Stable Diffusion XL for image generation.
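Discovering which models are available could look like the following sketch; the endpoint path and OpenAI-style response shape are assumptions for illustration:

```python
import os
import requests

# Hypothetical model-listing call; path and response format are assumed,
# not taken from the IONOS documentation.
API_URL = "https://inference.example.ionos.com/v1/models"  # assumed URL
headers = {"Authorization": f"Bearer {os.environ['IONOS_API_TOKEN']}"}

response = requests.get(API_URL, headers=headers, timeout=30)
response.raise_for_status()
for model in response.json().get("data", []):
    print(model.get("id"))
```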
#3 · about 7 minutes
How to implement RAG with a single API call
Retrieval-Augmented Generation (RAG) is simplified by abstracting vector database lookups and prompt augmentation into one API request using collection IDs and queries.
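As a rough sketch of that single-call pattern: the endpoint, field names ("collectionId", "query"), and response shape below are assumptions chosen to illustrate the idea, not the documented API:

```python
import os
import requests

# Illustrative single-call RAG request: the service looks up matching chunks
# in the named vector database collection, augments the prompt with them,
# and runs inference, all in one round trip.
API_URL = "https://inference.example.ionos.com/v1/rag"  # assumed URL
headers = {"Authorization": f"Bearer {os.environ['IONOS_API_TOKEN']}"}

payload = {
    "model": "meta-llama/Llama-3-8B-Instruct",  # assumed model identifier
    "collectionId": "my-document-collection",   # assumed field: collection to search
    "query": "What is our refund policy?",      # assumed field: user question
}

response = requests.post(API_URL, headers=headers, json=payload, timeout=60)
response.raise_for_status()
print(response.json())
```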
#4 · about 1 minute
Building end-to-end AI solutions in European data centers
Combine the AI Model Hub with IONOS Managed Kubernetes to build and deploy full AI applications within German data centers for data sovereignty.
#5 · about 3 minutes
Enabling direct GPU access within managed Kubernetes
The NVIDIA GPU Operator will enable direct consumption of GPU resources within IONOS Managed Kubernetes by automatically installing the required NVIDIA drivers, container toolkit, and Kubernetes device plugin.
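Once the operator has installed those components, the device plugin advertises GPUs as the extended resource nvidia.com/gpu. A minimal sanity check with the official Kubernetes Python client might look like this:

```python
from kubernetes import client, config

# Requires the `kubernetes` Python client and access to the cluster.
config.load_kube_config()  # or config.load_incluster_config() inside the cluster
v1 = client.CoreV1Api()

# Print how many GPUs each node advertises after the GPU Operator is running.
for node in v1.list_node().items:
    gpus = node.status.allocatable.get("nvidia.com/gpu", "0")
    print(f"{node.metadata.name}: {gpus} allocatable GPU(s)")
```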
#6 · about 3 minutes
Deploying custom inference workloads with NVIDIA NIMs
With the GPU Operator in place, request GPUs directly in a pod spec and deploy NVIDIA Inference Microservices (NIMs) to run custom, containerized AI models on your own infrastructure.
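A minimal sketch of such a deployment using the Kubernetes Python client. The NIM image tag is an assumption; NIM containers are distributed via NVIDIA's NGC registry and typically require an image pull secret, which is omitted here:

```python
from kubernetes import client, config

config.load_kube_config()

# Pod that requests one GPU via the nvidia.com/gpu extended resource,
# which the GPU Operator's device plugin makes schedulable.
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="llama3-nim"),
    spec=client.V1PodSpec(
        containers=[
            client.V1Container(
                name="nim",
                image="nvcr.io/nim/meta/llama3-8b-instruct:latest",  # assumed image tag
                ports=[client.V1ContainerPort(container_port=8000)],
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}  # one GPU for this pod
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```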