Testing AI/ML systems like a PRO

Top repository with 27 educational projects. Building your Second Brain AI assistant (our new FREE course).

Paul Iusztin

Mar 08, 2025

This week’s topics:

Testing AI/ML systems like a PRO
Top repository with 27 educational projects
Building your Second Brain AI assistant (our new FREE course)

Quick guide on testing AI/ML apps

A quick guide on everything you have to know about testing AI/ML apps ↓

The goal is to:

test the ML app across 3 dimensions: data, model, and code
ensure that it's well integrated with external services
check expected requirements (e.g., latency)
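
As a minimal sketch of the latency requirement above, a pytest-style test can time repeated calls and compare a percentile against a budget. The `predict` function, the 200 ms budget, and the 50 runs are all assumptions made for illustration; swap in your own inference call and target.

```python
import statistics
import time

LATENCY_BUDGET_S = 0.2  # assumed budget (200 ms), not a value from the article


def predict(features):
    """Hypothetical stand-in for a real model or API call."""
    return sum(features) / len(features)


def measure_latencies(fn, payload, runs=50):
    """Time `runs` sequential calls and return the per-call durations in seconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(payload)
        latencies.append(time.perf_counter() - start)
    return latencies


def test_p95_latency_within_budget():
    latencies = measure_latencies(predict, [0.1, 0.2, 0.3])
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = statistics.quantiles(latencies, n=20)[-1]
    assert p95 < LATENCY_BUDGET_S
```

A percentile is used instead of the mean so that a few slow outliers show up in the check rather than being averaged away.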

In the development cycle, 6 primary types of tests are commonly employed at various stages:

Unit tests: focus on individual components with a single responsibility.
Integration tests: evaluate the interaction between integrated components or units within a system, such as how a feature engineering pipeline is integrated with the feature store.
System tests: rigorously evaluate the end-to-end functionality of the system, including performance, security, and overall user experience.
Acceptance tests: designed to confirm that the system meets specified requirements.
Regression tests: check for previously identified errors to ensure that new changes do not reintroduce them.
Stress tests: evaluate the system's performance and stability under extreme conditions.

What do we test?

You take a component and treat it as a black box.

What you have control over is the input and output. Test that you get an expected output for a given input:

inputs: data types, format, length, edge cases
outputs: data types, formats, exceptions

Tools: Pytest (you don't need anything else)
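
Here is a rough sketch of what such black-box tests could look like. The `clean_text` component is hypothetical and exists only to have something to test. The tests are plain `test_*` functions with bare asserts, which pytest collects as-is; in a real suite you would likely use `pytest.raises` for the exception check, but it is written by hand here to keep the snippet dependency-free.

```python
def clean_text(text: str, max_length: int = 50) -> str:
    """Hypothetical component: collapse whitespace, lowercase, truncate."""
    if not isinstance(text, str):
        raise TypeError(f"expected str, got {type(text).__name__}")
    return " ".join(text.split()).lower()[:max_length]


def test_output_type_and_format():
    result = clean_text("  Hello   WORLD \n")
    assert isinstance(result, str)
    assert result == "hello world"


def test_edge_cases():
    # Empty input, whitespace-only input, and input longer than the limit.
    assert clean_text("") == ""
    assert clean_text("   ") == ""
    assert len(clean_text("a" * 1000)) == 50


def test_invalid_input_raises():
    try:
        clean_text(None)
    except TypeError:
        return
    raise AssertionError("clean_text(None) should raise TypeError")
```

Each test only touches the inputs and outputs, so the implementation of `clean_text` can change freely as long as the contract holds.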

Test examples

Your data validation code usually runs when raw data is transformed into features.

Thus, by writing integration or system tests for your feature pipeline, you can check that your system responds appropriately to valid and invalid data.

You can check for length, character encoding, language, and special characters when working with unstructured data such as text.
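
A minimal sketch of such text checks, assuming a hypothetical `validate_text` helper and an arbitrary 2,000-character limit. It covers length, a common sign of encoding problems (the Unicode replacement character), and control characters. Language detection is left out because it would need a third-party library.

```python
import unicodedata

MAX_CHARS = 2_000  # assumed limit, pick one that fits your data
ALLOWED_CONTROL = "\n\r\t"


def validate_text(text: str) -> list[str]:
    """Return a list of issue codes; an empty list means the text looks valid."""
    issues = []
    if not text.strip():
        issues.append("empty")
    if len(text) > MAX_CHARS:
        issues.append("too_long")
    # U+FFFD usually means bytes were decoded with the wrong encoding upstream.
    if "\ufffd" in text:
        issues.append("bad_encoding")
    # Category "Cc" covers control characters such as NUL or ESC.
    if any(
        unicodedata.category(ch) == "Cc" and ch not in ALLOWED_CONTROL
        for ch in text
    ):
        issues.append("control_chars")
    return issues


def test_valid_and_invalid_text():
    assert validate_text("A clean sentence.\n") == []
    assert validate_text("   ") == ["empty"]
    assert validate_text("x" * (MAX_CHARS + 1)) == ["too_long"]
    assert validate_text("caf\ufffd") == ["bad_encoding"]
    assert validate_text("bad\x00byte") == ["control_chars"]
```

Returning issue codes instead of raising makes it easy for a feature pipeline to decide whether to drop, repair, or log a record.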

Model tests are the trickiest, as models are non-deterministic and AI apps can run without throwing any errors while still producing incorrect results.

Standard model tests:

check the shapes of the input and model output tensors;
check that the loss decreases after one batch (or more) of training;
overfit on a small batch and check that the loss approaches 0.

All the tests are triggered inside the CI pipeline.
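
The three standard model tests can be sketched as follows. To stay self-contained, the example uses a tiny linear model written in plain Python as a stand-in for a real framework model; the model, the data, the learning rate, and the step count are all assumptions. With PyTorch or similar, the same three assertions would apply to tensor shapes and the framework's loss.

```python
def predict(weights, bias, batch):
    """Linear model: one scalar prediction per input row."""
    return [sum(w * x for w, x in zip(weights, row)) + bias for row in batch]


def mse(preds, targets):
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


def train_step(weights, bias, batch, targets, lr=0.1):
    """One full-batch gradient descent step on the MSE loss."""
    n = len(batch)
    errors = [p - t for p, t in zip(predict(weights, bias, batch), targets)]
    grad_w = [
        2 / n * sum(e * row[j] for e, row in zip(errors, batch))
        for j in range(len(weights))
    ]
    grad_b = 2 / n * sum(errors)
    return [w - lr * g for w, g in zip(weights, grad_w)], bias - lr * grad_b


# Tiny fixed batch; targets follow y = 2*x1 - 3*x2 + 1, so a perfect fit exists.
BATCH = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
TARGETS = [3.0, -2.0, 0.0, 1.0]


def test_output_shape():
    assert len(predict([0.0, 0.0], 0.0, BATCH)) == len(BATCH)


def test_loss_decreases_after_one_step():
    w, b = [0.0, 0.0], 0.0
    before = mse(predict(w, b, BATCH), TARGETS)
    w, b = train_step(w, b, BATCH, TARGETS)
    assert mse(predict(w, b, BATCH), TARGETS) < before


def test_overfits_small_batch():
    w, b = [0.0, 0.0], 0.0
    for _ in range(500):
        w, b = train_step(w, b, BATCH, TARGETS)
    assert mse(predict(w, b, BATCH), TARGETS) < 1e-6
```

Fixed data and fixed initial weights keep these tests deterministic, which matters when they gate a CI pipeline.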

You can also perform behavioral testing on your model:

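invariance: Changes to the input that should not matter (e.g., swapping a name or a location) should not affect the output.
directional: Changes to the input that should matter should move the output in the expected direction.
minimum functionality: The simplest combinations of inputs and expected outputs should always pass.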
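
To make the three behavioral test types concrete, here is a small sketch against a toy keyword-based sentiment scorer. The `sentiment_score` function and its word lists are hypothetical and only stand in for a real model; the structure of the tests is what carries over.

```python
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}


def sentiment_score(text: str) -> int:
    """Toy model: count of positive words minus count of negative words."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def test_invariance():
    # Swapping the city should not change the sentiment.
    assert sentiment_score("I love the hotel in Paris.") == sentiment_score(
        "I love the hotel in Berlin."
    )


def test_directional():
    base = sentiment_score("The food was good.")
    # Adding praise should raise the score; replacing it with a complaint should lower it.
    assert sentiment_score("The food was good and the service was great.") > base
    assert sentiment_score("The food was awful.") < base


def test_minimum_functionality():
    # The simplest possible inputs with obvious expected outputs.
    assert sentiment_score("great") > 0
    assert sentiment_score("awful") < 0
    assert sentiment_score("table") == 0
```

With a real model, the invariance check would usually compare predicted labels, or scores within a tolerance, instead of exact equality.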

🔗 For a deep dive, consider reading our article from Decoding ML ↓

The 6 MLOps foundational principles (Paul Iusztin, September 28, 2024)

Top repository with 27 educational projects

The best way to learn AI production is by building your tools from scratch. Here is a repository with 27 projects and step-by-step tutorials.

To build production-ready AI, you MUST have good SWE skills. No excuse.

The best way to acquire these skills is by building complex apps.

This is where the build-your-own-x GitHub repository can help you out.

It is among the 10 most popular repositories on GitHub, with 345k 🌟 (created and maintained by CodeCrafters.io).

345k GitHub 🌟... That's wild!

Build your own <insert-technology-here> GitHub repository

It contains free tutorials on how to code tools and apps from scratch, such as:

git
shell
Docker
BitTorrent
3D Renderer
Operating system
Blockchain

...in multiple languages such as Python, Node, Java, Rust, C++ and more.

Love it!

→💻 Repository

To take it to the next level while following the same “build from scratch” methodology, CodeCrafters offers project-based courses that teach you how to build more complex tools such as Redis, SQLite and Kafka with interactive:

feedback
instructions
Q&A section

I tried their Redis series and 100% recommend it (as do other developers from Google, Nvidia, Meta, and more!)

🔗 If you are considering subscribing to CodeCrafters, use my link for 40% off:

Subscribe (40% off)

CodeCrafters lets you expense 100% of your subscription through your corporate L&D budget.

Building your Second Brain AI assistant (our new FREE course)

I've created 5 popular open-source courses and a bestselling book.

But this project is my best work to date:

Building Your Second Brain AI Assistant course

This course will guide you through building a personal AI assistant that connects to your notes, research, and digital resources.

Think of it as a Notion-like assistant powered by advanced AI techniques, such as LLMs, agents, and RAG.

In just 6 comprehensive modules, you’ll master:

Architecting production-ready LLM and agent systems.
Implementing advanced RAG pipelines for AI assistants.
Using LLMOps best practices to fine-tune and deploy your models.
Working with cutting-edge tools like OpenAI, Hugging Face, MongoDB, ZenML, Opik and Unsloth.

You’ll be building an AI research assistant that can:

Chat with your Second Brain
Generate paragraphs or answer questions based on your research and notes
Summarize documents
Deliver insights based on your own knowledge base

And thanks to our amazing sponsors, we've kept it 100% free.

If you know anything about my resources, you'll know this isn’t just a theory course.

I love to get hands-on!

Thus, you'll be architecting and implementing real-world systems while structuring your Python code as you would at your job, using Python modules, uv and ruff (no more single-file projects, which you never see in the industry).

If you’re ready to take your skills to the next level and build your own personal AI assistant, this course will get you there.

Get started by checking out its GitHub page with all the preparation details:

GO TO COURSE

Let’s build your Second Brain AI assistant together!

Whenever you’re ready, there are 3 ways we can help you:

Perks: Exclusive discounts on our recommended learning resources (books, live courses, self-paced courses and learning platforms).
The LLM Engineer’s Handbook: Our bestselling book, which teaches you an end-to-end framework for building production-ready LLM and RAG applications, from data collection to deployment (get up to 20% off using our discount code).
Free open-source courses: Master production AI with our end-to-end open-source courses, reflecting real-world AI projects and covering everything from system architecture to data collection, training and deployment.

Images

If not otherwise stated, all images are created by the author.
