Stanislas Girard

Chatbots are going to destroy infrastructures and your cloud bills

That simple AI feature is secretly a costly monolith. Learn how to separate fast and slow tasks before your cloud bill explodes.

#1 · about 3 minutes

Comparing web developers and data scientists before GenAI

Before generative AI, web developers focused on CPU-bound tasks and horizontal scaling while data scientists worked with GPU-bound tasks and vast resources.

#2 · about 3 minutes

The new AI engineer role and the RAG pipeline

The emergence of the AI engineer role combines web development and data science skills, often applied to building RAG pipelines for data ingestion and querying.
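This chapter stays at the whiteboard level, so here is a rough, self-contained sketch of the two halves of a RAG pipeline: a slow ingestion path that chunks and indexes documents, and a fast query path that retrieves context and builds a prompt. The talk does not prescribe a stack, so the bag-of-words "embedding" and the missing LLM call below are placeholders for whatever embedding model and provider you would actually use.

```python
# Minimal RAG sketch: the embedding and LLM pieces are stand-ins for real models.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in embedding: a bag-of-words vector. A real pipeline would call
    # an embedding model here (GPU-bound, slow, often batched offline).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# --- Ingestion path (slow, batch): index document chunks ---------------------
documents = [
    "Kubernetes schedules containers across a cluster of nodes.",
    "Vector databases store embeddings for similarity search.",
    "GPUs accelerate matrix multiplication for neural networks.",
]
index = [(doc, embed(doc)) for doc in documents]

# --- Query path (fast, user-facing): embed the question, retrieve, generate --
def answer(question: str, k: int = 2) -> str:
    q_vec = embed(question)
    top = sorted(index, key=lambda item: cosine(q_vec, item[1]), reverse=True)[:k]
    context = "\n".join(doc for doc, _ in top)
    # A real implementation would send `context` + `question` to an LLM here.
    return f"[LLM prompt]\nContext:\n{context}\nQuestion: {question}"

print(answer("How do I search embeddings?"))
```

Keeping the two paths visibly separate is what the later chapters build on: ingestion is heavy batch work, while the query path is the latency-sensitive piece users actually wait on.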

#3 · about 2 minutes

Key architectural challenges in building GenAI apps

Generative AI applications face unique architectural problems, including long response times, sequential bottlenecks, and the difficulty of mixing CPU and GPU-bound processes.
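To make the sequential bottleneck concrete: a typical request handler chains embedding, retrieval, and generation, so the user waits for the sum of those stages, and only some of them get faster by adding CPU replicas. The timings in this sketch are illustrative assumptions, not figures from the talk.

```python
# Illustrative only: why a GenAI request is slow end to end.
import time

def embed_query(question):        # GPU/remote call, assumed ~100 ms
    time.sleep(0.1); return question

def search_vector_db(vector):     # I/O-bound, assumed ~50 ms
    time.sleep(0.05); return ["chunk-1", "chunk-2"]

def call_llm(prompt):             # GPU-bound, assumed seconds for a long answer
    time.sleep(2.0); return "generated answer"

def handle_request(question):
    start = time.perf_counter()
    vector = embed_query(question)          # stage 1 blocks...
    chunks = search_vector_db(vector)       # ...stage 2 blocks...
    reply = call_llm(f"{chunks} {question}")  # ...stage 3 dominates the wait
    print(f"total latency: {time.perf_counter() - start:.2f}s")
    return reply

handle_request("why is my cloud bill so high?")  # ~2.15 s, dominated by the LLM
```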

#4 · about 3 minutes

How a simple chatbot evolves into a large monolith

Adding features like document ingestion and web scraping to a simple chatbot can rapidly increase its resource consumption and Docker image size, creating a complex monolith.
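The talk's demo app is not reproduced here; the hypothetical FastAPI service below just shows the shape of the problem: chat, document ingestion, and web scraping all live in one process, so the single Docker image (and every replica of it) has to ship the headless browser, parsers, and ML runtime even though most traffic only hits the chat path.

```python
# Hypothetical monolithic chatbot service: every feature lives in one image.
from fastapi import FastAPI

app = FastAPI()

@app.post("/chat")          # latency-sensitive, called constantly
async def chat(message: str):
    # would call the LLM provider and return/stream the reply
    return {"reply": f"echo: {message}"}

@app.post("/ingest")        # CPU/GPU heavy: download, parse, chunk, embed
async def ingest(document_url: str):
    # PDF parsing and the embedding model would run inside this web process
    return {"status": "ingesting", "url": document_url}

@app.post("/scrape")        # needs a headless browser baked into the image
async def scrape(url: str):
    # a browser-driven crawl would run here, blocking a web worker
    return {"status": "scraping", "url": url}

# Scaling problem: a spike on /chat forces you to scale this whole image,
# browser and ML runtime included, and the image itself keeps growing.
```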

#5 · about 4 minutes

Refactoring a monolithic AI app into a service architecture

To manage complexity and cost, a monolithic AI application should be refactored by separating user-facing logic from heavy background tasks into distinct, independently scalable services.
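One common way to realize that split is sketched below, with FastAPI, Celery, and Redis as assumed tooling rather than anything mandated by the talk: a thin user-facing API that only enqueues heavy jobs, and a separate worker that consumes the queue, so each side gets its own image, dependencies, and scaling policy.

```python
# --- api_service.py: small image, scales with user traffic -------------------
from fastapi import FastAPI
from celery import Celery

queue = Celery("tasks", broker="redis://localhost:6379/0")
app = FastAPI()

@app.post("/chat")
async def chat(message: str):
    # lightweight: proxy the message to the LLM provider and return the reply
    return {"reply": f"echo: {message}"}

@app.post("/ingest")
async def ingest(document_url: str):
    # heavy work is only *enqueued* here; the API responds immediately
    job = queue.send_task("worker.ingest_document", args=[document_url])
    return {"job_id": job.id}

# --- worker.py: big image (parsers, embedding model), scales with backlog ----
# worker_app = Celery("tasks", broker="redis://localhost:6379/0")
#
# @worker_app.task(name="worker.ingest_document")
# def ingest_document(document_url: str):
#     # download, parse, chunk, embed, write to the vector store
#     ...
```

The API image stays small and scales with user traffic; the worker image carries the parsers and models and scales with queue depth, which is where the cost control in this chapter comes from.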

#6 · about 3 minutes

Choosing the right architecture for your application's workload

A monolithic architecture suits low or steady, predictable workloads, while a service-based approach is necessary for applications with high or spiky traffic, where it keeps costs under control and lets each component scale independently.

#7 · about 2 minutes

Overlooked challenges of running AI applications in production

Beyond core architecture, running AI in production involves complex challenges like managing GPUs on Kubernetes, model versioning, data compliance, and testing non-deterministic outputs.
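The chapter flags these as open problems rather than giving recipes. For the last one, a hedged illustration is to test properties of a model's answer (structure, required facts) instead of exact strings; `generate_answer` below is a hypothetical wrapper around the real, non-deterministic model call.

```python
# Illustrative pytest-style checks for a non-deterministic LLM answer:
# assert properties of the output rather than an exact string.
import json

def generate_answer(question: str) -> str:
    # placeholder standing in for the real (non-deterministic) model call
    return json.dumps({"answer": "Paris", "sources": ["doc-12"]})

def test_answer_is_valid_json_with_required_fields():
    raw = generate_answer("What is the capital of France?")
    payload = json.loads(raw)        # structure is stable even if wording isn't
    assert {"answer", "sources"} <= payload.keys()
    assert payload["sources"], "model must cite at least one source"

def test_answer_contains_expected_fact():
    raw = generate_answer("What is the capital of France?")
    assert "paris" in raw.lower()    # check the fact, not the phrasing

if __name__ == "__main__":
    test_answer_is_valid_json_with_required_fields()
    test_answer_contains_expected_fact()
    print("ok")
```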

#8 · about 2 minutes

Using creative evaluations and starting with small models

A creative evaluation using a game like Street Fighter reveals that smaller, faster LLMs can outperform larger ones for many use cases, making them a better starting point.
