Jodie Burchell

Vectorize all the things! Using linear algebra and NumPy to make your Python code lightning fast.

We took a naive KNN implementation and made it 1000x faster. The secret wasn't a new language, but vectorized NumPy operations.

#1 about 3 minutes

Why Python loops become slow at scale

Traditional loops over lists become a major performance bottleneck when processing large amounts of data.
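
As a rough illustration of the bottleneck, the sketch below computes the same sum of squares with an explicit Python loop and with a single NumPy call. The function names and the data size are assumptions made for this example, not taken from the talk, and the timings it prints will vary by machine.

```python
import time
import numpy as np

def sum_of_squares_loop(values):
    """Sum of squares using an explicit Python loop over a list."""
    total = 0
    for v in values:
        total += v * v
    return total

def sum_of_squares_numpy(values):
    """Same computation as one vectorized dot product on a NumPy array."""
    arr = np.asarray(values, dtype=np.int64)
    return int(np.dot(arr, arr))

# Hypothetical input size, chosen only to make the difference measurable.
data = list(range(100_000))
array_data = np.array(data, dtype=np.int64)

start = time.perf_counter()
loop_result = sum_of_squares_loop(data)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
numpy_result = sum_of_squares_numpy(array_data)
numpy_seconds = time.perf_counter() - start

print(f"loop:  {loop_seconds:.6f} s")
print(f"numpy: {numpy_seconds:.6f} s")
```

Both versions return the same number; only the time spent getting there differs.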

#2 about 5 minutes

Representing data with vectors, matrices, and NumPy arrays

Learn the fundamentals of linear algebra: data points are vectors, datasets are matrices, and NumPy arrays are the data structure that holds both.
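
A minimal sketch of that mapping, using made-up feature values: a single data point becomes a 1-D array, and a dataset becomes a 2-D array with one row per point and one column per feature.

```python
import numpy as np

# One data point with three features is a vector (1-D array).
# The numbers are hypothetical and serve only as an illustration.
point = np.array([5.1, 3.5, 1.4])

# A dataset of four such points is a matrix (2-D array):
# rows are data points, columns are features.
dataset = np.array([
    [5.1, 3.5, 1.4],
    [4.9, 3.0, 1.4],
    [6.2, 3.4, 5.4],
    [5.9, 3.0, 5.1],
])

print(point.shape)    # (3,)
print(dataset.shape)  # (4, 3)

# Arithmetic between two rows is applied element by element, with no explicit loop.
difference = dataset[0] - dataset[1]
```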

#3 about 3 minutes

A high-level overview of the KNN algorithm

The k-nearest neighbors algorithm classifies data points by finding the most common label among their closest neighbors in a vector space.

#4 about 4 minutes

Coding a slow baseline KNN with Python lists

A walkthrough of an unoptimized k-nearest neighbors implementation demonstrates the severe performance issues caused by nested loops.
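
The talk's own code is not reproduced in this summary. The following is a minimal sketch of what a list-based baseline with nested loops might look like, assuming Manhattan distance and a simple majority vote. The function names and the toy data are hypothetical.

```python
from collections import Counter

def manhattan_distance(a, b):
    """Sum of absolute coordinate differences, computed one feature at a time."""
    total = 0
    for x, y in zip(a, b):
        total += abs(x - y)
    return total

def knn_predict_loops(train_points, train_labels, test_points, k=3):
    """Classify each test point by majority vote among its k nearest training points."""
    predictions = []
    for test_point in test_points:  # outer loop: every test point
        distances = []
        for train_point, label in zip(train_points, train_labels):  # inner loop: every training point
            distances.append((manhattan_distance(test_point, train_point), label))
        distances.sort(key=lambda pair: pair[0])  # nearest first
        nearest_labels = [label for _, label in distances[:k]]
        predictions.append(Counter(nearest_labels).most_common(1)[0][0])
    return predictions

# Hypothetical toy data: two well-separated clusters.
train_points = [[1, 1], [1, 2], [2, 1], [8, 8], [9, 8], [8, 9]]
train_labels = ["a", "a", "a", "b", "b", "b"]
test_points = [[2, 2], [9, 9]]

print(knn_predict_loops(train_points, train_labels, test_points, k=3))  # ['a', 'b']
```

The work grows with the number of test points times the number of training points times the number of features, and every step runs in the Python interpreter.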

#5 about 3 minutes

Vectorizing the distance calculation to remove a loop

Replace the inner loop for calculating Manhattan distance with a vectorized subtraction on NumPy arrays, followed by an absolute value and a sum.
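
A small sketch of that replacement, with hypothetical function names: the per-feature loop and the NumPy expression compute the same Manhattan distance.

```python
import numpy as np

def manhattan_loop(a, b):
    """Manhattan distance with an explicit loop over the features."""
    total = 0.0
    for x, y in zip(a, b):
        total += abs(x - y)
    return total

def manhattan_vectorized(a, b):
    """Same distance: subtract whole arrays, take absolute values, sum."""
    return np.abs(a - b).sum()

a = np.array([1.0, 4.0, 2.0])
b = np.array([3.0, 1.0, 2.0])

print(manhattan_loop(a, b))        # 5.0
print(manhattan_vectorized(a, b))  # 5.0
```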

#6 about 9 minutes

Eliminating nested loops with NumPy array broadcasting

Use array reshaping and broadcasting to perform all pairwise distance calculations simultaneously, avoiding explicit replication and nested loops.
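
One way this could look, as a sketch rather than the talk's exact code: add a length-1 axis to each array so that NumPy broadcasts them against each other, then reduce over the feature axis. The function name and the sample points are assumptions.

```python
import numpy as np

def pairwise_manhattan(test_points, train_points):
    """Manhattan distance between every test point and every training point."""
    # test_points:  (n_test, n_features)  -> (n_test, 1, n_features)
    # train_points: (n_train, n_features) -> (1, n_train, n_features)
    # Broadcasting stretches the length-1 axes, giving (n_test, n_train, n_features)
    # without copying either input explicitly.
    diff = test_points[:, np.newaxis, :] - train_points[np.newaxis, :, :]
    # Summing absolute differences over the feature axis leaves (n_test, n_train).
    return np.abs(diff).sum(axis=2)

# Hypothetical sample data.
train_points = np.array([[1, 1], [1, 2], [8, 8]])
test_points = np.array([[2, 2], [9, 9]])

distances = pairwise_manhattan(test_points, train_points)
print(distances)
# [[ 2  1 12]
#  [16 15  2]]
```

The intermediate array holds n_test × n_train × n_features values, so this approach trades memory for speed. For very large datasets the test points may need to be processed in batches.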

#7 about 3 minutes

Optimizing neighbor selection with NumPy sorting and slicing

Gain final performance improvements by replacing Python's list sorting with NumPy's faster sorting algorithms and efficient array slicing.
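
A possible sketch of this step, assuming a precomputed distance matrix of shape (n_test, n_train) and integer class labels starting at 0. The function name, the label encoding, and the voting step are assumptions for illustration, not the talk's exact code.

```python
import numpy as np

def knn_vote(distances, train_labels, k=3):
    """Pick the k nearest training points per test point and take a majority vote."""
    # argsort along each row orders training indices from nearest to farthest;
    # slicing keeps only the first k columns.
    nearest_idx = np.argsort(distances, axis=1)[:, :k]
    # Indexing the label array with the index matrix gives shape (n_test, k).
    nearest_labels = train_labels[nearest_idx]
    # Majority vote per row. Assumes labels are non-negative integers.
    n_classes = int(train_labels.max()) + 1
    return np.array([
        np.bincount(row, minlength=n_classes).argmax()
        for row in nearest_labels
    ])

# Hypothetical distances from 2 test points to 4 training points.
distances = np.array([
    [2, 1, 12, 3],
    [16, 15, 2, 1],
])
train_labels = np.array([0, 0, 1, 1])

print(knn_vote(distances, train_labels, k=3))  # [0 1]
```

The sort and the slice operate on the whole matrix at once. Only the final vote still iterates over test points in this sketch.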

#8 about 6 minutes

How memory layout makes NumPy arrays so fast

NumPy arrays are faster because they store data in a contiguous block of memory, which is more efficient for the CPU to process than the scattered memory of Python lists.
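
The layout can be inspected directly. The sketch below uses standard NumPy array attributes (`flags`, `itemsize`, `strides`, `nbytes`); the array sizes are arbitrary choices for the example.

```python
import numpy as np

# 1000 64-bit integers stored back to back in one block of memory.
arr = np.arange(1000, dtype=np.int64)

print(arr.flags["C_CONTIGUOUS"])  # True
print(arr.itemsize)               # 8 bytes per element
print(arr.strides)                # (8,): step 8 bytes to reach the next element
print(arr.nbytes)                 # 8000 bytes in total

# Reshaping reinterprets the same block; no data is moved.
matrix = arr.reshape(10, 100)
print(matrix.strides)             # (800, 8): 800 bytes to the next row, 8 to the next column

# A column slice is a view that skips through memory, so it is not contiguous.
column = matrix[:, 0]
print(column.flags["C_CONTIGUOUS"])  # False
```

A Python list, by contrast, stores references to separately allocated objects, so reading its values means following a pointer for each element.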

#9 about 22 minutes

Audience Q&A on performance and data science

Answers common questions about NumPy's underlying C implementation, hyperparameter tuning, memory management, and career paths in data science.
