PI #005: Harnessing the Strength of the Batch Architecture in Serving ML Models
Serving an ML Model Using a Batch Architecture. The Perfect DUO: FastAPI + Streamlit
This newsletter aims to give you weekly insights about designing and productionizing ML systems using MLOps good practices.
This week I will go over the following:
Why serving an ML model using a batch architecture is so powerful
The Perfect DUO: FastAPI + Streamlit
Also, I have some exciting news to share with you guys.
Why serving an ML model using a batch architecture is so powerful
When you first start deploying your ML model, you want an initial end-to-end flow as fast as possible.
Doing so lets you quickly provide value, get feedback, and even collect data.
But here is the catch...
Successfully serving an ML model is tricky, as you need many iterations to optimize your model to work in real time:
- low latency
- high throughput
Initially, serving your model in batch mode is a neat hack.
By storing the model's predictions in dedicated storage, you move the heavy computation offline while keeping the results available online: the model runs offline, but its predictions are consumed in real time.
Thus, you no longer have to worry about your model's latency and throughput. The consumer loads the predictions directly from the given storage.
These are the main steps of a batch architecture (a minimal pipeline sketch follows the list):
- extract raw data from a real data source
- clean, validate, and aggregate the raw data within a feature pipeline
- load the cleaned data into a feature store
- experiment to find the best model + transformations using the data from the feature store
- upload the best model from the training pipeline into the model registry
- inside a batch prediction pipeline, use the best model from the model registry to compute the predictions
- store the predictions in some storage
- the consumer will download the predictions from the storage
- repeat the whole process hourly, daily, weekly, etc. (it depends on your context)
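To make the prediction step concrete, here is a minimal, hedged sketch in Python. The helpers `load_model_from_registry` and `load_features_from_feature_store` are hypothetical stand-ins for your model registry and feature store client calls, and the bucket path is illustrative:

```python
# A minimal sketch of a batch prediction pipeline.
# `load_model_from_registry` and `load_features_from_feature_store` are
# hypothetical stand-ins for your registry/feature-store client calls.
import pandas as pd


def run_batch_prediction_pipeline() -> None:
    # 1. Pull the best model from the model registry.
    model = load_model_from_registry(name="my-model", stage="production")

    # 2. Load the cleaned, validated features from the feature store.
    features: pd.DataFrame = load_features_from_feature_store(view="my-features")

    # 3. Compute the predictions offline, free of latency constraints.
    predictions = pd.DataFrame({"prediction": model.predict(features)})

    # 4. Store the predictions where the consumer can read them directly.
    predictions.to_parquet("s3://my-bucket/predictions.parquet")


# An orchestrator (e.g., Airflow) would rerun this hourly, daily, or weekly.
```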
The main downside of deploying your model in batch mode is that the predictions will have a level of lag.
For example, in a recommender system, if you compute the predictions daily, you won't capture a user's behavior in real time; the recommendations will update only at the end of the day.
That is why moving to other architectures, such as request-response or streaming, will be natural after your system matures in batch mode.
So remember: when you initially deploy your model, a batch architecture will be your best shot at a good user experience.
Let me know in the comments what your strategy is.
Want to learn ML & MLOps in a structured way? After 6 months of work, I finally finished The Full Stack 7-Steps MLOps Framework Medium series.
In 2.5 hours of reading & video materials, you will learn how to:
- design a batch-serving architecture
- use Hopsworks as a feature store
- design a feature engineering pipeline that reads data from an API
- build a training pipeline with hyperparameter tuning
- use W&B as an ML Platform to track your experiments, models, and metadata
- implement a batch prediction pipeline
- use Poetry to build your own Python packages
- deploy your own private PyPi server
- orchestrate everything with Airflow
- use the predictions to code a web app using FastAPI and Streamlit
- use Docker to containerize your code
- use Great Expectations to ensure data validation and integrity
- monitor the performance of the predictions over time
- deploy everything to GCP
- build a CI/CD pipeline using GitHub Actions
- trade-offs & future improvements discussion
You can access the course on:
- Medium's TDS publication: text tutorials + videos
- GitHub: open-source code + docs
I published the course on Medium's TDS publication to make it accessible to as many people as possible. Thus...
... anyone can learn the fundamentals of MLE & MLOps.
So no more excuses. Just go and build your own project.
- GitHub Code
- Course on Medium
The Medium link above might not work well on mobile. So here are the first lessons (you will find the rest of them within the articles):
- Lesson 1: A Framework for Building a Production-Ready Feature Engineering Pipeline
- Lesson 2: A Guide to Building Effective Training Pipelines for Maximum Results
- Lesson 4: Unlocking MLOps using Airflow: A Comprehensive Guide to ML System Orchestration
I worked hard to give you a seamless experience while taking the course. Let me know on LinkedIn if you have any questions and how your experience was. Thanks!
The Perfect DUO: FastAPI + Streamlit
2 tools you should know as an ML Engineer
Here are 2 reasons why FastAPI & Streamlit should be in your MLE stack:
#1. Python, Python, Python!
As an MLE, Python is your magic wand.
Using FastAPI & Streamlit, you can build full-stack web apps using solely Python.
#2. Extremely flexible
Using FastAPI & Streamlit, you can deploy an ML model in almost any scenario.
<< Batch >>
Expose the predictions from any storage, such as S3 or Redis, using FastAPI as REST endpoints.
Visualize the predictions using Streamlit by calling the FastAPI REST endpoints.
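As a hedged illustration of the batch scenario, here is a minimal sketch. The bucket name, key layout, and port are hypothetical:

```python
# api.py - a minimal sketch of exposing precomputed predictions with FastAPI.
# The S3 bucket and key naming are hypothetical placeholders.
import json

import boto3
from fastapi import FastAPI, HTTPException

app = FastAPI()
s3 = boto3.client("s3")


@app.get("/predictions/{consumer_id}")
def get_predictions(consumer_id: str):
    # Read the predictions that the batch pipeline stored earlier.
    try:
        obj = s3.get_object(Bucket="my-predictions", Key=f"{consumer_id}.json")
    except s3.exceptions.NoSuchKey:
        raise HTTPException(status_code=404, detail="No predictions found.")
    return json.loads(obj["Body"].read())
```

```python
# app.py - a minimal Streamlit UI that calls the FastAPI endpoint above.
import pandas as pd
import requests
import streamlit as st

st.title("Batch Predictions")
consumer_id = st.text_input("Consumer ID", value="demo")
response = requests.get(f"http://localhost:8000/predictions/{consumer_id}")
if response.status_code == 200:
    st.line_chart(pd.DataFrame(response.json()))
else:
    st.warning("No predictions found for this consumer.")
```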
<< Request-Response >>
Wrap your model using FastAPI and expose its functionalities as REST endpoints.
Yet again... visualize the predictions using Streamlit by calling the FastAPI REST endpoints.
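Here is a minimal sketch of the request-response scenario, assuming a pickled scikit-learn-style model (the file name and feature schema are hypothetical):

```python
# api.py - a minimal sketch of wrapping a model behind a REST endpoint.
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Load the model once at startup; "model.pkl" is a hypothetical artifact.
with open("model.pkl", "rb") as f:
    model = pickle.load(f)


class PredictionRequest(BaseModel):
    features: list[float]


@app.post("/predict")
def predict(request: PredictionRequest):
    # The model runs on every request, so latency and throughput now matter.
    prediction = model.predict([request.features])
    return {"prediction": prediction.tolist()}
```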
<< Stream >>
Wrap your model using FastAPI and expose it as REST endpoints.
But this time, the REST endpoints will be called from a Flink or Kafka Streams microservice.
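As a hedged, Python-only stand-in for that Flink or Kafka Streams microservice, a plain Kafka consumer could call the same endpoint for every event (topic names, broker address, and URL are hypothetical):

```python
# stream_scorer.py - a simplified Python stand-in for a streaming job
# that scores each incoming event through the FastAPI model service.
import json

import requests
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer("events", bootstrap_servers="localhost:9092")
producer = KafkaProducer(bootstrap_servers="localhost:9092")

for message in consumer:
    event = json.loads(message.value)
    # Call the REST endpoint exposed by FastAPI for each event.
    response = requests.post(
        "http://localhost:8000/predict",
        json={"features": event["features"]},
    )
    # Publish the prediction to a downstream topic.
    producer.send("predictions", json.dumps(response.json()).encode("utf-8"))
```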
Using this tech stack won't be the optimal solution in 100% of use cases,
... but in most cases:
- it will get the job done
- you can quickly prototype almost any ML application.
So rememberโฆ
You should learn FastAPI & Streamlit because:
- Python all the way!
- you can quickly deploy a model in almost any architecture scenario
Do you use FastAPI & Streamlit?
To learn more, check out Lesson 6 of my MLE & MLOps course: FastAPI and Streamlit: The Python Duo You Must Know About.
Also, I want to let you know that DeepLearning.ai will soon release "The AI for Good Specialization"
"It is a beginner-friendly 3-course program that will teach you how to combine human and machine intelligence to create a positive social impact."
I am happy and excited that they take this stuff seriously and educate people about when (and if) you should use AI.
...and, of course, about its risks.
Pre-enroll now and get 14 free days.
This is not a promotional message. I am just excited to see that something like this exists.
See you next week on Thursday at 9:00 am CET.
Have a fabulous weekend!
My goal is to help machine learning engineers level up in designing and productionizing ML systems. Follow me on LinkedIn and Medium for more insights!
If you enjoy reading articles like this and wish to support my writing, consider becoming a Medium member. Using my referral link, you can support me at no extra cost while enjoying unlimited access to Medium's rich collection of stories.
Thank you!