Sandra Ahlgrimm & Kevin Lewis

Bringing AI Model Testing and Prompt Management to Your Codebase with GitHub Models

Is your AI development just 'vibes-based'? Learn how to run automated prompt evaluations as a blocking check on every pull request.

Bringing AI Model Testing and Prompt Management to Your Codebase with GitHub Models
#1about 3 minutes

The challenge of testing non-deterministic AI features

Traditional development relies on rigorous testing, but AI features are often implemented based on intuition without a structured evaluation process.

#2about 5 minutes

Managing prompts as code with GitHub Models

GitHub Models integrates AI development into your repository by defining prompts, models, and parameters in a version-controlled YAML file.

#3about 6 minutes

Using evaluators to compare AI model variants

The platform allows you to run multiple prompt and model variations against a test dataset to compare outputs on metrics like latency, coherence, and similarity.

#4about 5 minutes

Consuming prompt files in your application code

Use the GitHub Models inference API or the Azure AI Inference SDK to load your version-controlled prompt files and integrate AI calls directly into your application.

#5about 2 minutes

Local development and testing with the CLI

The GitHub CLI extension allows you to run prompts and execute model evaluations directly from your terminal for rapid, local iteration before committing changes.

#6about 4 minutes

Automating repository tasks with AI-powered actions

Use GitHub Actions to automate common repository tasks like generating changelogs from pull requests, triaging bug reports, or creating weekly issue summaries.

#7about 1 minute

Implementing CI/CD for AI prompt changes

Integrate prompt evaluations into your CI/CD pipeline using GitHub Actions to automatically run tests and block pull requests that degrade model performance.

#8about 2 minutes

Adopting GitHub Models in existing projects

You can quickly convert existing prompt files to the GitHub Models format to gain access to powerful evaluation, comparison, and automation capabilities.

Related jobs
Jobs that call for the skills explored in this talk.

test

Milly
Vienna, Austria

Intermediate

test

Milly
Vienna, Austria

Intermediate

Featured Partners

Related Articles

View all articles
CH
Chris Heilmann
Exploring AI: Opportunities and Risks for Developers
In today's rapidly evolving tech landscape, the integration of Artificial Intelligence (AI) in development presents both exciting opportunities and notable risks. This dynamic was the focus of a recent panel discussion featuring industry experts Kent...
Exploring AI: Opportunities and Risks for Developers
CH
Chris Heilmann
WWC24 Talk - Scott Hanselman - AI: Superhero or Supervillain?
Join Scott Hanselman at WWC24 to explore AI's role as a superhero or supervillain. Scott shares his 32 years of experience in software engineering, discusses AI myths, ethical dilemmas, and tech advancements. Engage with his live demos and insights o...
WWC24 Talk - Scott Hanselman - AI: Superhero or Supervillain?

From learning to earning

Jobs that call for the skills explored in this talk.