Dora Petrella

How We Built a Machine Learning-Based Recommendation System (And Survived to Tell the Tale)

How do you find the perfect substitute for an out-of-stock item? Learn how we adapted a natural language model to solve this critical e-commerce challenge.

#1 about 5 minutes

Defining the business need for product recommendations

A recommendation system for substitute products is needed across multiple touchpoints to prevent lost sales from out-of-stock items.

#2 about 2 minutes

Analyzing the limitations of the existing recommender

The previous system, based on the Jaccard coefficient, produced low-quality recommendations, particularly for new or unpopular items.
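
To make the limitation concrete, here is a minimal Python sketch of a Jaccard-based similarity, assuming each product is represented by the set of sessions it appeared in. The talk summary does not say which sets the original system compared, so the data layout and product names below are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B|; 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical data: product id -> ids of sessions that contained the product.
sessions_by_product = {
    "milk_1l": {1, 2, 3, 4},
    "oat_milk_1l": {2, 3, 4, 5},
    "new_almond_milk": set(),  # just launched, no interaction history yet
}

popular = jaccard(sessions_by_product["milk_1l"], sessions_by_product["oat_milk_1l"])
cold = jaccard(sessions_by_product["milk_1l"], sessions_by_product["new_almond_milk"])
print(popular, cold)
```

A product with little or no history scores zero or close to zero against everything else, which is consistent with the weak results reported for new and unpopular items.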

#3 about 5 minutes

Using the Prod2Vec algorithm for recommendations

The Prod2Vec algorithm, adapted from Word2Vec, learns product relationships by analyzing co-occurrence within user session context windows.
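
The context-window idea can be sketched in Python as follows. The sketch only generates the (target, context) training pairs that a Word2Vec-style skip-gram model would consume, treating each session as a sentence and each product id as a word. The window size and session contents are illustrative assumptions, not values from the talk.

```python
from typing import Iterator, List, Tuple


def skipgram_pairs(session: List[str], window: int) -> Iterator[Tuple[str, str]]:
    """Yield (target, context) pairs for products within `window` positions."""
    for i, target in enumerate(session):
        lo = max(0, i - window)
        hi = min(len(session), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a product is not its own context
                yield target, session[j]


# Hypothetical session: products viewed in order by one user.
session = ["milk_1l", "oat_milk_1l", "granola"]
pairs = list(skipgram_pairs(session, window=1))
print(pairs)
```

Products that often share a window end up with similar vectors after training, which is what makes nearest neighbours in the embedding space usable as substitute candidates.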

#4 about 2 minutes

Improving predictions with Meta-Prod2Vec and metadata

Incorporating product metadata like category and brand into the model (Meta-Prod2Vec) significantly improves recommendation quality for long-tail items.
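
One simple way to approximate this idea in Python is to expand each item pair with metadata tokens, so that a product with few interactions still receives training signal through its category and brand. This is a simplified sketch, not the team's implementation: the pair types emitted, the token format, and the product data are all assumptions.

```python
from typing import Dict, List, Tuple


def meta_pairs(
    target: str, context: str, metadata: Dict[str, List[str]]
) -> List[Tuple[str, str]]:
    """Expand one (target, context) item pair with the target's metadata tokens."""
    pairs = [(target, context)]
    for token in metadata.get(target, []):
        pairs.append((token, context))  # metadata of the target predicts the context item
        pairs.append((target, token))   # the item predicts its own metadata
    return pairs


# Hypothetical metadata: product id -> category and brand tokens.
metadata = {"oat_milk_1l": ["category:plant_milk", "brand:acme"]}
expanded = meta_pairs("oat_milk_1l", "milk_1l", metadata)
print(expanded)
```

Because metadata tokens are shared across many products, a long-tail item inherits part of its position in the embedding space from better-observed items with the same category or brand.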

#5 about 2 minutes

Implementing the end-to-end MLOps pipeline

The production system uses dbt for data transformation, a Vertex AI pipeline for model training, and Elasticsearch for efficient vector similarity search.
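
As an illustration of the serving step, the Python sketch below builds a request body for an approximate nearest-neighbour search, assuming Elasticsearch 8.x and its top-level `knn` search option on a `dense_vector` field. The summary does not state the version, mapping, or query style actually used, and the field names `embedding` and `in_stock` are hypothetical.

```python
from typing import Any, Dict, List


def build_knn_query(
    query_vector: List[float], k: int = 10, num_candidates: int = 100
) -> Dict[str, Any]:
    """Build a search body that asks for the k nearest in-stock products."""
    if num_candidates < k:
        raise ValueError("num_candidates must be at least k")
    return {
        "knn": {
            "field": "embedding",          # hypothetical dense_vector field
            "query_vector": query_vector,  # embedding of the out-of-stock product
            "k": k,
            "num_candidates": num_candidates,
            # Hypothetical boolean field: only return products that can be sold.
            "filter": {"term": {"in_stock": True}},
        },
        "_source": ["product_id"],
    }


body = build_knn_query([0.12, -0.40, 0.88], k=5)
print(body)
```

The body would be sent to the index's `_search` endpoint. Keeping the query construction in a plain function makes it easy to unit-test without a running cluster.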

#6 about 3 minutes

Evaluating model performance with offline and online metrics

Offline metrics like NDCG confirmed model quality, while mirror traffic analysis showed a 45% increase in product recommendation coverage.
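
For readers unfamiliar with NDCG, here is a minimal Python sketch of the metric using the common log2 discount. The relevance labels are made up for illustration, and the talk summary does not say how relevance was defined or which cutoff was used.

```python
import math
from typing import List


def dcg(relevances: List[float]) -> float:
    """Discounted cumulative gain: relevance at rank r is divided by log2(r + 1)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg_at_k(relevances: List[float], k: int) -> float:
    """DCG of the top-k list divided by the DCG of the ideal ordering."""
    ideal_dcg = dcg(sorted(relevances, reverse=True)[:k])
    if ideal_dcg == 0:
        return 0.0  # no relevant items at all
    return dcg(relevances[:k]) / ideal_dcg


# Hypothetical labels: 1 if the recommended substitute was accepted, else 0,
# listed in the order the model ranked the recommendations.
ranked_relevance = [1, 0, 1]
score = ndcg_at_k(ranked_relevance, k=3)
print(round(score, 4))
```

A score of 1.0 means the relevant items are ranked in the best possible order; relevant items placed lower in the list pull the score down.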

#7 about 3 minutes

Visualizing product relationships with the Embedding Projector

Loading the embeddings into TensorFlow's Embedding Projector shows how the model groups similar products into distinct clusters once the high-dimensional vectors are projected down for viewing.
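
The Embedding Projector can load tab-separated files: one with a vector per row and one with matching metadata rows. The Python sketch below prepares both as strings. The embedding values, product ids, and the `category` column are hypothetical, and the summary does not describe how the team exported their vectors.

```python
from typing import Dict, List, Tuple


def to_projector_tsv(
    embeddings: Dict[str, List[float]], categories: Dict[str, str]
) -> Tuple[str, str]:
    """Return (vectors_tsv, metadata_tsv) with rows in the same order."""
    vector_lines = []
    # Multi-column metadata files start with a header row.
    metadata_lines = ["product_id\tcategory"]
    for product_id, vector in embeddings.items():
        vector_lines.append("\t".join(f"{x:.6f}" for x in vector))
        metadata_lines.append(f"{product_id}\t{categories.get(product_id, 'unknown')}")
    return "\n".join(vector_lines) + "\n", "\n".join(metadata_lines) + "\n"


# Hypothetical two-dimensional embeddings, kept tiny for readability.
embeddings = {"milk_1l": [0.1, 0.9], "oat_milk_1l": [0.2, 0.8]}
categories = {"milk_1l": "dairy"}
vectors_tsv, metadata_tsv = to_projector_tsv(embeddings, categories)
print(vectors_tsv)
print(metadata_tsv)
```

Writing the two strings to files and loading them in the projector lets you color points by category and check whether clusters match intuition.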

#8 about 3 minutes

Adopting pragmatic baselines and automated data analysis

Key project takeaways include using simple business-logic baselines for benchmarking and automating exploratory data analysis within the ML pipeline itself.
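
As an example of what a business-logic baseline might look like, the Python sketch below recommends the best-selling other products from the same category. The specific rule is an assumption chosen for illustration, since the summary does not say which baselines the team used, and the catalog and sales figures are invented.

```python
from typing import Dict, List


def category_bestseller_baseline(
    product_id: str, catalog: Dict[str, str], sales: Dict[str, int], k: int = 3
) -> List[str]:
    """Recommend the top-k best sellers that share the product's category."""
    category = catalog[product_id]
    candidates = [
        other
        for other, other_category in catalog.items()
        if other_category == category and other != product_id
    ]
    # Sort by units sold (descending), then by id so ties are deterministic.
    candidates.sort(key=lambda p: (-sales.get(p, 0), p))
    return candidates[:k]


# Hypothetical catalog (product id -> category) and unit sales.
catalog = {
    "milk_1l": "milk",
    "oat_milk_1l": "milk",
    "soy_milk_1l": "milk",
    "granola": "cereal",
}
sales = {"oat_milk_1l": 120, "soy_milk_1l": 300, "granola": 500}
substitutes = category_bestseller_baseline("milk_1l", catalog, sales, k=2)
print(substitutes)
```

Scoring a rule like this with the same offline metrics as the learned model gives a reference point: if the model cannot beat it, the extra complexity is hard to justify.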

#9 about 1 minute

Understanding the project team and final timeline

The project was completed in nine months by a cross-functional team of data engineers, data scientists, and software developers.
