Sergio Perez & Harshita Seth
Adding knowledge to open-source LLMs
#1 · about 4 minutes
Understanding the LLM training pipeline and knowledge gaps
LLMs are trained through pre-training and alignment, but require new knowledge to stay current, adapt to specific domains, and acquire new skills.
#2 · about 5 minutes
Adding domain knowledge with continued pre-training
Continued pre-training adapts a foundation model to a specific domain by training it further on specialized, unlabeled data using self-supervised learning.
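The self-supervised objective behind continued pre-training needs no labels: raw domain text is turned into next-token prediction pairs by shifting the token sequence one position. A minimal sketch (the token IDs are purely illustrative, not from any real tokenizer):

```python
# Minimal sketch of the self-supervised setup used in continued
# pre-training: unlabeled domain text becomes (input, target) pairs
# by shifting the token sequence one position, so the model learns
# to predict each next token of the specialized corpus.

def next_token_pairs(token_ids):
    """Build next-token prediction pairs from a raw token sequence."""
    inputs = token_ids[:-1]   # model sees tokens 0..n-2
    targets = token_ids[1:]   # and must predict tokens 1..n-1
    return inputs, targets

# Example: a hypothetical tokenized sentence from a domain corpus.
ids = [101, 7592, 2088, 102]
x, y = next_token_pairs(ids)
# x == [101, 7592, 2088], y == [7592, 2088, 102]
```

In practice a training framework builds these shifted pairs internally and minimizes cross-entropy over the targets; the point is that any unlabeled domain text can be used directly.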
#3 · about 6 minutes
Developing skills and reasoning with supervised fine-tuning
Supervised fine-tuning uses instruction-based datasets to teach models specific tasks, chat capabilities, and complex reasoning through techniques like chain of thought.
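The core data transformation in supervised fine-tuning is rendering each instruction/response pair into a single training string, with a loss mask so that only the response contributes to the gradient. A sketch (the prompt template is an assumption; real pipelines use the model's own chat template and token-level masks rather than character-level ones):

```python
def build_sft_example(instruction, response):
    """Format one instruction/response pair and mark loss positions."""
    prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"
    text = prompt + response
    # Loss mask: 0 for prompt positions, 1 for response positions,
    # so gradients only flow from the answer the model must learn.
    mask = [0] * len(prompt) + [1] * len(response)
    return text, mask

text, mask = build_sft_example("What is 2 + 2?", "4")
```

Chain-of-thought data follows the same shape, with the reasoning steps included in the response so the model is trained to produce them.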
#4 · about 8 minutes
Aligning models with human preferences using reinforcement learning
Preference alignment refines model behavior using reinforcement learning, evolving from complex RLHF with reward models to simpler methods like DPO.
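DPO's simplification is that the preference objective can be written as a direct loss over log-probabilities, with no reward model or RL rollout. A per-pair sketch of the standard DPO loss (variable names are mine; real trainers compute this over batches of token-level log-probs):

```python
import math

def dpo_loss(policy_chosen, policy_rejected,
             ref_chosen, ref_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Inputs are summed log-probabilities of the chosen and rejected
    responses under the policy being trained and under the frozen
    reference model; beta controls drift from the reference.
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    # -log(sigmoid(beta * margin)): no reward model, no RL rollout.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

When the policy matches the reference the margin is zero and the loss is ln 2 ≈ 0.693; favoring the chosen response relative to the reference drives the loss down, which is the whole training signal.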
#5 · about 2 minutes
Using frameworks like NeMo RL to simplify model alignment
Open-source frameworks like NeMo RL abstract away the complexity of implementing advanced reinforcement-learning-based alignment algorithms.