Decoding ML #011: My Ideal ML Engineering Tech Stack
Supercharge Your ML System: Use a Model Registry. My Ideal ML Engineering Tech Stack.
Hello there, I am Paul Iusztin, and within this newsletter, I will deliver your weekly piece of MLE & MLOps wisdom straight to your inbox 🔥
Hello ML builders 👋
This week we will cover the following topics:
Supercharge Your ML System: Use a Model Registry
My Ideal ML Engineering Tech Stack
+ [Bonus] Something extra for you.
But first, I want to let you know something.
→ If you want to learn ML & MLOps in a structured way but are too busy to take an entire course, then I wrote the perfect article for you.
A "14-minute read" preview of my "The Full Stack 7-Steps MLOps Framework" course that explains how all the puzzle pieces (aka architecture components) work together.
It gives a high-level overview of how to design:
- a batch architecture
- feature, training, and inference pipelines
- orchestration
- data validation & monitoring
- a web app using FastAPI & Streamlit
- deployment & a CI/CD pipeline
- adapting the batch architecture to an online system
→ Check it out: The Full Stack 7-Steps MLOps Framework Preview

#1. Supercharge Your ML System: Use a Model Registry
A model registry is the holy grail of any production-ready ML system.
The model registry is the critical component that decouples your offline pipeline (experimental/research phase) from your production pipeline.
Compute Offline Features
Usually, when training your model, you use a static data source.
Using a feature engineering pipeline, you compute the necessary features used to train the model.
These features will be stored inside a feature store.
After processing your data, your training pipeline creates the training & testing splits and starts training the model.
The output of your training pipeline is the trained weights, also known as the model artifact.
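To make this flow concrete, here is a minimal, framework-free sketch of the feature pipeline feeding the training pipeline's split step. All names and the toy data are illustrative, not a real pipeline:

```python
import random

# Toy static data source (stand-in for a real raw dataset).
raw_rows = [{"age": 20 + i, "clicks": i * 3} for i in range(10)]


def compute_features(row: dict) -> dict:
    """Feature engineering pipeline: derive the features used for training."""
    return {"age": row["age"], "clicks_per_year": row["clicks"] / row["age"]}


# The computed features land in a "feature store" (here, just a list).
feature_store = [compute_features(r) for r in raw_rows]

# Training pipeline: build the train/test split from the stored features.
random.seed(42)
shuffled = random.sample(feature_store, k=len(feature_store))
cut = int(0.8 * len(shuffled))
train_split, test_split = shuffled[:cut], shuffled[cut:]
print(len(train_split), len(test_split))  # -> 8 2
```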
Here is where the model registry kicks in 👇
This artifact will be pushed into the model registry under a new version that can easily be tracked.
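As an illustration of "pushed under a new version", here is a toy, filesystem-based registry sketch. Real registries (W&B, MLflow, etc.) handle this for you; the `ModelRegistry` class and its directory layout are purely hypothetical:

```python
import json
import pickle
import tempfile
from pathlib import Path


class ModelRegistry:
    """Toy registry: every push creates a new, immutable, trackable version."""

    def __init__(self, root: str):
        self.root = Path(root)

    def push(self, name: str, model: object, metadata: dict) -> int:
        model_dir = self.root / name
        model_dir.mkdir(parents=True, exist_ok=True)
        # Auto-increment the version so every artifact stays trackable.
        version = 1 + max(
            (int(p.name[1:]) for p in model_dir.glob("v*")), default=0
        )
        version_dir = model_dir / f"v{version}"
        version_dir.mkdir()
        (version_dir / "model.pkl").write_bytes(pickle.dumps(model))
        (version_dir / "metadata.json").write_text(json.dumps(metadata))
        return version


registry = ModelRegistry(tempfile.mkdtemp())
weights = {"coef": [0.4, 1.1], "bias": 0.2}  # stand-in for trained weights
version = registry.push("churn-model", weights, {"f1": 0.87})
print(version)  # -> 1 (first push into a fresh registry)
```

Note how the metadata travels with the artifact: that is what lets you trace any deployed version back to the experiment that produced it.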
From this point on, the new model artifact version can be pulled by any serving strategy:
#1. batch
#2. request-response
#3. streaming
Your inference pipeline doesn't care how the model artifact was generated. It just has to know what model to use and how to transform the data into features.
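A sketch of the pulling side, reusing the same toy idea of a version-per-directory layout (the layout and the `pull_latest` helper are hypothetical; a real registry client exposes equivalent calls):

```python
import pickle
import tempfile
from pathlib import Path

# Fake a registry layout: <root>/<model-name>/v<N>/model.pkl
root = Path(tempfile.mkdtemp())
for version, bias in [(1, 0.1), (2, 0.2), (3, 0.3)]:
    version_dir = root / "churn-model" / f"v{version}"
    version_dir.mkdir(parents=True)
    (version_dir / "model.pkl").write_bytes(pickle.dumps({"bias": bias}))


def pull_latest(root: Path, name: str) -> dict:
    """Inference only asks for a model by name: it never needs to know
    how the artifact was produced or which framework trained it."""
    versions = sorted((root / name).glob("v*"), key=lambda p: int(p.name[1:]))
    return pickle.loads((versions[-1] / "model.pkl").read_bytes())


model = pull_latest(root, "churn-model")
print(model)  # -> {'bias': 0.3}, i.e., the newest version wins
```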
Note that this strategy is independent of the type of model & hardware you use:
- classic ML models (Scikit-learn, XGBoost),
- distributed systems (Spark),
- deep learning models (PyTorch)
To summarize...
Using a model registry is a simple and effective way to decouple your experimentation environment from your production environment, regardless of the framework or hardware you use.
To learn more, check out my practical & detailed example of how to use a model registry in my article: A Guide to Building Effective Training Pipelines for Maximum Results
#2. My Ideal ML Engineering Tech Stack
Here it is 👇
- Python: your bread & butter
- Rust: code optimization
- Sklearn + XGBoost: classic ML
- PyTorch: deep learning
- FastAPI: REST APIs
- Streamlit: UI
- Terraform: infrastructure
- Kafka: streaming
- Docker: containerization
- Kubernetes: horizontal scaling
- GitHub Actions: CI/CD
- Airflow: orchestration
- AWS: cloud
- Hopsworks: feature store
- W&B: experiment tracking, model & artifact registry
- DVC: data versioning
- Arize: observability
If that sounds like a lot... it is...
Sometimes finding your way out of this labyrinth of tools is a struggle.
But hey, at the end of the day, I have a lot of fun working with them.
Note that in some scenarios, the tool you use depends a lot on the context. For example, you might use ZenML instead of Airflow for orchestration.
That is perfectly fine. The most important thing is to know the underlying concept (e.g., orchestration). Afterward, you can quickly do your own research and pick the best tool for the job.
[Bonus] New 1-hour Free MLOps Course by DeepLearning.ai
This course is perfect if you want a quick intro to MLOps for generative AI.
DeepLearning.AI just released a ~1-hour MLOps course that will show you how to use W&B as your MLOps tool for:
- diffusion models
- LLMs
I recently skimmed through it, and it is worth it.
🔗 Evaluating and Debugging Generative AI
That's it for today 👾
See you next Thursday at 9:00 am CET.
Have a fantastic weekend!
Paul
Whenever you're ready, here is how I can help you:
The Full Stack 7-Steps MLOps Framework: a 7-lesson FREE course that will walk you step-by-step through how to design, implement, train, deploy, and monitor an ML batch system using MLOps good practices. It contains the source code + 2.5 hours of reading & video materials on Medium.
Machine Learning & MLOps Blog: here, I approach in-depth topics about designing and productionizing ML systems using MLOps.